DAGWorks-Inc / hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.
https://hamilton.dagworks.io/en/latest/
BSD 3-Clause Clear License
1.56k stars, 92 forks

[good first issue - beginner] Polars Readers & Writers #410

Closed: skrawcz closed this 2 months ago

skrawcz commented 9 months ago

Is your feature request related to a problem? Please describe.

We need to add Polars Readers & Writers (Savers & Loaders in our internal parlance).

Describe the solution you'd like

We need readers and, where appropriate, writers that cover the I/O listed here:

Example implementation to mirror can be found here.

Additional context

We need to start building wrappers around the common ways people will want to save and load data. That way they'll have off-the-shelf ways to get onto Hamilton easily.

If you're interested in contributing

If you are interested in contributing, picking up one of the above should be straightforward.

  1. Ask for one, and we'll assign it.
  2. We'll create an issue for you.
  3. We'll then work with you on that issue.

In terms of effort, for an example of a desired class, see this code. It basically involves:

  1. Reading the subsequent documentation.
  2. Creating the right class.
  3. Creating some tests for it.
  4. Creating an example to put into our examples repository.
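
As a rough illustration of steps 2 and 3 above, here is a minimal, dependency-free sketch of the loader/saver shape. Everything in it is an assumption made for illustration: the class names `CSVRowsReader` and `CSVRowsWriter`, the method names (`applicable_types`, `name`, `load_data`, `save_data`), and the metadata keys are hypothetical stand-ins, and the standard-library `csv` module takes the place of Polars so the snippet runs anywhere. A real contribution would subclass Hamilton's own base classes and call the relevant Polars function instead, following the example implementation linked in this issue.

```python
import csv
import dataclasses
import os
import tempfile
from typing import Any, Collection, Dict, List, Tuple, Type

Rows = List[Dict[str, str]]


@dataclasses.dataclass
class CSVRowsReader:
    """Hypothetical loader: reads a CSV file into a list of dict rows."""

    path: str
    delimiter: str = ","

    @classmethod
    def applicable_types(cls) -> Collection[Type]:
        # Types this loader can produce (a plain list in this sketch).
        return [list]

    @classmethod
    def name(cls) -> str:
        # Short key that would identify the format.
        return "csv"

    def load_data(self, type_: Type) -> Tuple[Rows, Dict[str, Any]]:
        # Return the data together with metadata about what was read.
        with open(self.path, newline="") as f:
            rows = list(csv.DictReader(f, delimiter=self.delimiter))
        metadata = {
            "path": self.path,
            "size_bytes": os.path.getsize(self.path),
            "row_count": len(rows),
        }
        return rows, metadata


@dataclasses.dataclass
class CSVRowsWriter:
    """Hypothetical saver: writes a list of dict rows to a CSV file."""

    path: str
    delimiter: str = ","

    @classmethod
    def applicable_types(cls) -> Collection[Type]:
        return [list]

    @classmethod
    def name(cls) -> str:
        return "csv"

    def save_data(self, data: Rows) -> Dict[str, Any]:
        # Column order follows the keys of the first row.
        fieldnames = list(data[0].keys()) if data else []
        with open(self.path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter=self.delimiter)
            writer.writeheader()
            writer.writerows(data)
        return {
            "path": self.path,
            "size_bytes": os.path.getsize(self.path),
            "row_count": len(data),
        }


def round_trip(rows: Rows) -> Tuple[Rows, Dict[str, Any]]:
    """Write rows to a temporary CSV file and read them back (a test-style check)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.csv")
        CSVRowsWriter(path).save_data(rows)
        return CSVRowsReader(path).load_data(list)
```

A test for such a class would typically be a round trip like `round_trip` above: save a small dataset, load it back, and compare both the data and the returned metadata.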

swapdewalkar commented 9 months ago

@skrawcz I'll take this up!

ghost commented 9 months ago

Is this issue assigned as a whole or can we take up individual components from the list (Parquet, JSON, etc.) similar to #284 ?

skrawcz commented 9 months ago

> @skrawcz I'll take this up!

@swapdewalkar let me know which one you want to take.

> Is this issue assigned as a whole or can we take up individual components from the list (Parquet, JSON, etc.) similar to https://github.com/DAGWorks-Inc/hamilton/issues/284 ?

@kokobhara correct, like #284. I will break it up into components as people ask for bits.

swapdewalkar commented 9 months ago

@skrawcz I can start with json and parquet.

skrawcz commented 9 months ago

> @skrawcz I can start with json and parquet.

Sure! Please comment on #417 and #418 to have me assign it to you.

ghost commented 9 months ago

@skrawcz Can I take this up for Avro?

skrawcz commented 9 months ago

> @skrawcz Can I take this up for Avro?

Sure - please comment on https://github.com/DAGWorks-Inc/hamilton/issues/420 so I can assign it to you.

somaiyamansi commented 9 months ago

Hello, I'd like to work on Feather/IPC

skrawcz commented 9 months ago

> Hello, I'd like to work on Feather/IPC

@somaiyamansi please comment on https://github.com/DAGWorks-Inc/hamilton/issues/421

swapdewalkar commented 9 months ago

@skrawcz Assign Spreadsheet and Database to me, I have already started working on them.

skrawcz commented 9 months ago

> @skrawcz Assign Spreadsheet and Database to me, I have already started working on them.

@swapdewalkar #449 and #450 are yours.