Open d33bs opened 4 months ago
Good catch. I'll try to fix it.
After a quick look, it seems that the examples for Table
and RecordBatch
are being used redundantly. This is because a Table
is made up of multiple RecordBatch
objects, so they work with the same logic internally, and the documentation is generated through docstrings.
From a user perspective, without understanding the concepts (What is a Table
? What is a RecordBatch
?), or when copying and pasting code to use with other code, it might lead to unintended results.
On the other hand, from a development and documentation management perspective, the current form doesn’t seem incorrect.
What are your thoughts on this? cc @AlenkaF
These docstrings were united between Table
and RecordBatch
when the parent class _Tabular
was added to consolidate Table
and RecordBatch
functionality (see https://github.com/apache/arrow/issues/36129). In these consolidated methods docstrings there is a text added:
Table (works similarly for RecordBatch)
that should help with understanding why Table
example is used for RecordBatch
also. If you look in the RecordBatch
class examples, you have the "correct" class usage.
For me this is quite clear but can imagine from a user's perspective it might be confusing. If you have a good proposal for clearer content in the consolidated methods examples, I am definitely open for improvements.
Thanks @llama90 and @AlenkaF for looking into this! Perhaps this could benefit from Sphinx conditionals (or something similar, if Sphinx isn't used). I'm not certain if these would work within the context of docstrings and any other special configurations which may be at hand. Thoughts?
We use Sphinx but I do not think Sphinx conditionals will work for code docstrings.
Describe the enhancement requested
Thanks for the fantastic work on Arrow and PyArrow! I noticed a minor typo in the docs for code examples with
RecordBatch.from_pydict()
andRecordBatch.from_pylist()
which mentionpa.Table.from_{x}
instead ofpa.RecordBatch.from_{x}
. This may result in possible user confusion, needing to look up the source code, or try things out to make sure the Python API can handle the calls.For example, under
pa.RecordBatch.from_pydict()
:Currently shows:
Should be:
The same stands for examples in
pa.RecordBatch.from_pylist()
.Thanks in advance for help with correcting this.
Component(s)
Documentation, Python