jazzband / tablib

Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
https://tablib.readthedocs.io/
MIT License

Dataset.subset interface badly explained, works only if headers are set #366

Open dheller1 opened 5 years ago

dheller1 commented 5 years ago

It took me quite a while to figure out how to use Dataset.subset.

The documentation states the interface is subset(rows=None, cols=None), so my first assumption was to just pass two ints for the number of requested rows and columns, and when that didn't work I passed a list of column indices, but also to no avail.

Only after debugging did I find out that I first need to define column headers for the Dataset instance and then pass a subset of these headers to cols.

In my opinion, this could be made clearer in the documentation. Also, is there a reason why headers are required and we cannot alternatively just pass column indices?

I would go ahead and try to implement that myself if you don't mind.

tribals commented 2 years ago

I still don't understand how to use it...

I just need a slice of a Dataset, returned as another Dataset instance. That's what "Pythonic" means. If you make Dataset behave like, say, a list, then do it all the way. Then I could, for example, take a subset of my data and convert it into another format. Returning a list from a slice is just ugly.