DAGWorks-Inc / hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
https://hamilton.dagworks.io/en/latest/
BSD 3-Clause Clear License

Automate Article Collection and README Updates from Blog Source #1208

Closed sikehish closed 3 weeks ago

sikehish commented 1 month ago

This pull request introduces a script that fetches articles from a specified URL (https://blog.dagworks.io/archive), processes the data, and updates the README file with the fetched articles. It addresses issue #1046.
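As a rough illustration of the fetch, process, and update flow described above, here is a minimal sketch. It is not the script from this PR. The marker comments, the function names, and the rule that article links contain `/p/` in their `href` are all assumptions made for the example. The archive page may also render its list client-side, in which case a plain HTTP fetch would not be enough.

```python
import re
from html.parser import HTMLParser
from urllib.request import urlopen

ARCHIVE_URL = "https://blog.dagworks.io/archive"
# Hypothetical markers; the PR's script may delimit the README section differently.
START = "<!-- ARTICLES:START -->"
END = "<!-- ARTICLES:END -->"


class LinkCollector(HTMLParser):
    """Collect (title, url) pairs from anchors whose href contains '/p/' (an assumption)."""

    def __init__(self):
        super().__init__()
        self.articles = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if "/p/" in href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            title = " ".join("".join(self._text).split())
            if title and (title, self._href) not in self.articles:
                self.articles.append((title, self._href))
            self._href = None


def fetch_html(url=ARCHIVE_URL):
    """Download the archive page as text."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_articles(html):
    """Return a de-duplicated list of (title, url) tuples in page order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.articles


def update_readme(readme_text, articles):
    """Replace the marked section with a bullet list, or append it if absent."""
    lines = "\n".join(f"- [{title}]({url})" for title, url in articles)
    block = f"{START}\n{lines}\n{END}"
    pattern = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)
    if pattern.search(readme_text):
        return pattern.sub(lambda _match: block, readme_text, count=1)
    return readme_text.rstrip("\n") + "\n\n" + block + "\n"
```

Keeping the parsing and README rewriting as pure functions means they can be exercised on a saved HTML snippet without touching the network; only `fetch_html` performs I/O.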

skrawcz commented 1 month ago

Awesome, looking forward to trying this tomorrow / Monday!

Otherwise one request -- could you include a "print to standard out" option please?

sikehish commented 4 weeks ago

> Awesome, looking forward to trying this tomorrow / Monday!
>
> Otherwise one request -- could you include a "print to standard out" option please?

Hi @skrawcz. I've added that option. I've also added a docstring to the Python script to make it easier for users to get started.
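For readers wondering what a "print to standard out" option might look like, here is a small sketch using `argparse`. The flag name `--print`, the `--readme` argument, and the helper names are hypothetical and are not taken from the PR's script.

```python
import argparse
import sys
from pathlib import Path


def build_parser():
    """Build a CLI parser; the flag names here are illustrative assumptions."""
    parser = argparse.ArgumentParser(description="Update the README article list.")
    parser.add_argument("--readme", default="README.md", help="README file to update")
    parser.add_argument(
        "--print",
        dest="to_stdout",
        action="store_true",
        help="print the result to standard out instead of writing the file",
    )
    return parser


def emit(new_text, readme_path, to_stdout, stream=None):
    """Write new_text to the stream (default: stdout) or to the README file."""
    if to_stdout:
        target = stream if stream is not None else sys.stdout
        target.write(new_text)
    else:
        Path(readme_path).write_text(new_text, encoding="utf-8")
```

A dry-run style flag like this lets a reviewer inspect the generated article list before any file is modified.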

skrawcz commented 4 weeks ago

@sikehish you need to:

  1. squash these changes down to a single commit (this will make rebasing simpler)
  2. rebase.

Then I can merge this. Otherwise tested locally and it works, thank you!

sikehish commented 3 weeks ago

Hi @skrawcz, I tried squashing and rebasing all the commits. Is this alright? I've created a backup branch where I've cherry-picked the 6-7 commits relevant to this PR. If this one looks messed up, I can open another PR from that branch and close this one.

skrawcz commented 3 weeks ago

Thanks @sikehish!