Open alamb opened 1 year ago
If someone wanted to help out the DataFusion project, helping with this one would be awesome. A good first step would be to make the skeleton of the topics above in https://github.com/apache/arrow-datafusion/tree/main/docs and leave placeholder text (like "Coming Soon").
Then we can work together on writing the content in a few different PRs.
This sounds great, really excited!
We'll either want two user guides or one user guide that's half in Python / half in Rust.
I guess that 99% of the users that want to query data via an API will want to do so in SQL / Python. The Python DataFrame user guide is way more important than the Rust one.
Users leveraging DataFusion to build tools for other engines (e.g. delta-rs) are much more likely to be using Rust.
Perhaps we divide the documentation as follows:
I don't think we should invest in building out the DataFusion Rust DataFrame API docs yet because it's a lower-ROI activity. We should, however, build a URL structure that allows for it.
> The Python DataFrame user guide is way more important than the Rust one.
I agree this is more important for "end users" than for developers who are building with Rust.
> Perhaps we divide the documentation as follows:
That sounds great -- I filed https://github.com/apache/arrow-datafusion-python/issues/432 to track the work for the Python bindings.
I filed a bunch of tickets for follow-on work and updated the description of this ticket:
https://github.com/apache/arrow-datafusion/issues/7302
https://github.com/apache/arrow-datafusion/issues/7304
https://github.com/apache/arrow-datafusion/issues/7305
https://github.com/apache/arrow-datafusion/issues/7306
https://github.com/apache/arrow-datafusion/issues/7307
https://github.com/apache/arrow-datafusion/issues/7308
Is your feature request related to a problem or challenge?
If we want to have DataFusion used as the core of many new systems, we need it to be as easy as possible for someone to get their idea working on top of DataFusion.
I think the current user guide helps set up the basics of the project and get a "hello world" style program going, but then kind of leaves the reader in a "now what" situation: https://arrow.apache.org/datafusion/user-guide/example-usage.html
Describe the solution you'd like
I would like a document, perhaps similar in style to the polars user guide: https://pola-rs.github.io/polars-book/user-guide/
Basically, I am thinking of something that would have helped @BubbaJoe get up to speed.
The examples directory holds a bunch of examples: https://github.com/apache/arrow-datafusion/tree/main/datafusion-examples
Potential outline:
TableProviders
(in https://github.com/apache/arrow-datafusion/pull/7287)
Describe alternatives you've considered
No response
Additional context
This idea was suggested by @MrPowers