Open ion-elgreco opened 4 months ago
Without taking a position on whether it should be there or not (though it doesn't seem unreasonable), how about read_delta_changes
(or something similar) as the name? "cdf" doesn't seem very descriptive 🤔
True, you would need to know delta to know it. How about read_delta_changefeed?
Yup, that would work for me; even clearer ;)
I'm not a big fan of expanding our API surface for third-party integrations like this. If Delta adds 10 more features, do we add 10 more methods?
Perhaps it's sufficient to add an example to the user guide that shows how to read a changelog into Polars using the deltalake
package directly?
@stinodego I get your point, but this is probably the last thing. I was thinking of adding it to read/scan_delta, but it probably won't fit nicely into one API
An example could work yeah!
Description
@stinodego with deltalake Python v0.17.3, a change data feed reader was added. Are you OK with me adding a new method:
pl.read_delta_cdf()
Essentially it would just be a shortcut for
pl.DataFrame(dt.load_cdf(starting_version=<>, ending_version=<>))
https://github.com/delta-io/delta-rs/blob/11ab3f68493d32c620f76c8e33671e626d8f0dde/python/deltalake/table.py#L687-L688