ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0

docs: Is there any documentation on the code principles and overall architecture? #8148

Open stereoF opened 10 months ago

stereoF commented 10 months ago

Please describe the issue

Ibis has implemented something I've been particularly wanting to do since 2019, so I would like to learn how it's done through the source code. Perhaps in the future, I might even contribute a bit. However, Ibis seems to be a very large project now, and I don't know where to start. If there is a document about the principles and overall architecture, I can start from the points I'm interested in.

lostmygithubaccount commented 10 months ago

hi @stereoF, great to see interest in contributing! we would definitely appreciate it

I don't think we have a great writeup of the internals right now, but there is a concept doc here: https://ibis-project.org/concepts/internals

as you may or may not be aware, we're in the process of a major refactor of the internals in the-epic-split branch: https://github.com/ibis-project/ibis/tree/the-epic-split

once that is merged, I expect @kszucs to help write up a blog post on all the changes here and the internals. this PR description would also be good reading on getting up to speed on the internals: https://github.com/ibis-project/ibis/pull/7752

kszucs commented 10 months ago

> I don't think we have a great writeup of the internals right now, but there is a concept doc here: https://ibis-project.org/concepts/internals

That is pretty outdated actually.

> as you may or may not be aware, we're in the process of a major refactor of the internals in the-epic-split branch: https://github.com/ibis-project/ibis/tree/the-epic-split

That refactors only the relational operations and types, which are actually a smaller fraction of the internals.

> once that is merged, I expect @kszucs to help write up a blog post on all the changes here and the internals. this PR description would also be good reading on getting up to speed on the internals: #7752

Well, we could have a series of blog posts about the internals, but we will still need a comprehensive guide to track the current codebase.