tecosaur / DataToolkit.jl

Reproducible, flexible, and convenient data management
https://tecosaur.github.io/DataToolkit.jl

Feature request: Arrow loader #30

Closed jfb-h closed 5 months ago

jfb-h commented 7 months ago

A loader for Arrow files would be great. I tried to look up how to add a new loader, but it was a bit difficult for me to navigate the codebase. If that's something users could contribute, maybe a tutorial would be helpful.

tecosaur commented 7 months ago

An Arrow loader would be good! If you're up for it, I'd actually like to take this as a chance to improve the documentation. Would you be up for giving this a shot if I work with you to find out what documentation is needed to make implementing a new loader seem straightforward?

jfb-h commented 7 months ago

I'd be happy to help where I can, but I would probably need a fair bit of hand-holding, as I have not contributed much to open source yet. Right now, I would not really know where to start. Could you maybe point me to the implementation of the CSV loader?

tecosaur commented 7 months ago

Hmmm, I'll mark that idea down as a "maybe" then. Happy to point you to some example loaders as a starting point though. The CSV one isn't pretty to look at (a bunch of option interpretation code), so I'd draw your attention to json.jl and delim.jl instead.

jfb-h commented 7 months ago

Thanks, the delim loader looks pretty manageable. So load and save are the basic interface? What is the supportedtypes(::Type{DataLoader{:json}}) in the json loader for? Thanks for your help, I appreciate it!
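
For readers unfamiliar with the syntax in that signature: an argument written as `::Type{T}` makes a Julia method dispatch on the type `T` itself rather than on an instance of it, and a `Symbol` such as `:json` can serve as a type parameter. The following is only a standalone sketch of that dispatch pattern. `DemoLoader`, the `:lines` driver, and both functions are hypothetical stand-ins defined here, not DataToolkit's actual API, and the sketch makes no claim about what the real `supportedtypes` does.

```julia
# Hypothetical stand-in type, NOT DataToolkit's DataLoader.
# The Symbol type parameter names the driver.
struct DemoLoader{driver} end

# Instance method: dispatches on a loader *value* whose driver is :lines.
load(::DemoLoader{:lines}, io::IO) = readlines(io)

# Type method: `::Type{DemoLoader{:lines}}` dispatches on the type itself,
# so it can be called without constructing a loader.
supportedtypes(::Type{DemoLoader{:lines}}) = [Vector{String}]

loader = DemoLoader{:lines}()
rows = load(loader, IOBuffer("a\nb\n"))     # call with an instance
types = supportedtypes(DemoLoader{:lines})  # call with the type
```

Because the second method takes the type rather than a value, it can be queried before any loader object exists.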

tecosaur commented 5 months ago

Implemented in tecosaur/DataToolkitCommon.jl#10, thanks Jakob.