Open osmuser63783 opened 9 months ago
I agree, it would be nice to be able to call `len(features)` in addition to (or instead of) `features.count`.
The reason `__len__` is not implemented is related to the way the Python `list` constructor works. As an optimization, it attempts to pre-allocate the list's backing array with an exact size if the source collection implements `__len__`.
However, this would cause the query to execute twice: once to get its length (i.e. `count`), then again to retrieve the features and populate the list. This is, in fact, what your code sample does -- it calls `count` to size the progress bar, then performs the query again in the `for` loop. In most cases, queries execute so quickly that the performance impact is negligible, but large/complex queries could potentially take minutes.
So it's a compromise that lets the user decide if/when to incur the cost of the extra query run.
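To make the double execution concrete, here is a minimal sketch. `FakeQuery` and `FakeQueryWithLen` are hypothetical stand-ins, not GeoDesk classes; they only count how often the "query" runs. The behavior shown is that of CPython's `list` constructor, which asks for a size hint when it builds a list from an iterable.

```python
class FakeQuery:
    """Hypothetical stand-in for a lazily evaluated feature set."""

    def __init__(self, items):
        self._items = list(items)
        self.runs = 0  # number of times the "query" has been executed

    def _execute(self):
        # Pretend this is the expensive part.
        self.runs += 1
        return self._items

    def __iter__(self):
        return iter(self._execute())


class FakeQueryWithLen(FakeQuery):
    def __len__(self):
        # Counting the results means running the query as well.
        return len(self._execute())


plain = FakeQuery(range(5))
sized = FakeQueryWithLen(range(5))

list(plain)  # iterates only
list(sized)  # iterates and also asks for len() to pre-size the list

print("runs without __len__:", plain.runs)
print("runs with __len__:   ", sized.runs)
```

With `__len__` defined, the run counter for the second object should end up higher than for the first, which is the extra query run described above.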
Ideally, if `PyObject_LengthHint()` could try the object's `__length_hint__` function before `__len__`, the Query Engine could then decide whether to run the query or just provide a cheap estimate.
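The lookup order can be observed from Python via `operator.length_hint()`, which follows the same rule: `__len__` first, `__length_hint__` only as a fallback. A small sketch, with `HintOnly` and `Both` as made-up classes for illustration:

```python
from operator import length_hint


class HintOnly:
    """Offers only a cheap estimate, no exact length."""

    def __length_hint__(self):
        return 1000  # rough guess, no query needed


class Both:
    """Offers both; records which method gets called."""

    def __init__(self):
        self.calls = []

    def __len__(self):
        self.calls.append("__len__")
        return 3

    def __length_hint__(self):
        self.calls.append("__length_hint__")
        return 1000


hint_only = length_hint(HintOnly())
both = Both()
both_result = length_hint(both)

print(hint_only)    # value taken from __length_hint__
print(both_result)  # value taken from __len__
print(both.calls)   # shows which method was consulted
```

So as long as `__len__` exists, the estimate in `__length_hint__` is never consulted, which is why a cheap hint cannot currently take priority over an exact (and expensive) count.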
(This is a very minor feature request)
I was wondering if there's a reason that the number of features in a feature set is provided by `count` instead of `__len__`. It's more intuitive to call `len(features)` than `features.count`.
Also, I am working interactively (in a Jupyter notebook) and when I iterate over feature sets I use `tqdm` to show me a progress bar. Because feature set length is provided by `count` instead of `__len__()` I have to type: `for h in tqdm(planet("w[highway]"), total = planet("w[highway]").count)` instead of just `for h in tqdm(planet("w[highway]"))` for every loop :-)
And yes, I am very lazy :-D
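In the meantime, a thin adapter on the user side can save the typing. This is only a sketch: `WithLen` is a hypothetical helper, not part of GeoDesk, and `FakeFeatures` stands in for a real feature set so the snippet runs on its own. It relies on `tqdm` falling back to `len(iterable)` when no `total` is given.

```python
class WithLen:
    """Hypothetical adapter that exposes a feature set's .count as len()."""

    def __init__(self, features):
        self._features = features

    def __iter__(self):
        return iter(self._features)

    def __len__(self):
        # Delegates to .count, so this still costs one extra query run.
        return self._features.count


class FakeFeatures:
    """Stand-in for a feature set: iterable, with a .count attribute."""

    def __init__(self, items):
        self._items = list(items)

    @property
    def count(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


highways = WithLen(FakeFeatures(["a", "b", "c"]))
print(len(highways))

# Intended use with a real feature set (assumes tqdm is installed):
# for h in tqdm(WithLen(planet("w[highway]"))):
#     ...
```

Note that the wrapper opts in to the extra query run explicitly, so it is best kept for progress bars and not used when just building a list.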