Open jstcki opened 4 years ago
Hey, thanks!
Your suggestion makes sense to me. It will make a few things uglier, for instance:
.groupBy("raum") -> .groupBy("https://ld.stadt-zuerich.ch/statistics/property/RAUM")
.filter(({ someDate }) => someDate.not.equals("2019-08-29T07:27:56.241Z")); would no longer be possible (no big deal though).
I'll try something and we can then discuss the details in a PR.
Note that querying for labels on all dimensions makes everything much slower, so I wonder if there would be a better way to do this, e.g. by only querying for labels in cube.dimensions() and then stitching them together with a label-less result from cube.query(). Haven't tried it though.
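A minimal sketch of the stitching idea, assuming the labels have already been fetched once per dimension and the observations come back label-less and keyed by dimension IRI. The types, the function name `stitchLabels`, the value IRI under `https://example.org/`, and the label text are all hypothetical and are not part of the library's API.

```typescript
// Hypothetical shapes for illustration only, not the library's actual types.
type Iri = string;
type LabelLookup = Map<Iri, string>;            // value IRI -> label
type Row = Record<Iri, string>;                 // dimension IRI -> value (IRI or literal)
type LabelledCell = { value: string; label?: string };
type LabelledRow = Record<Iri, LabelledCell>;

// Attach labels to label-less rows using one lookup table per dimension.
function stitchLabels(
  rows: Row[],
  labelsByDimension: Map<Iri, LabelLookup>,
): LabelledRow[] {
  return rows.map((row) => {
    const out: LabelledRow = {};
    for (const [dimension, value] of Object.entries(row)) {
      const label = labelsByDimension.get(dimension)?.get(value);
      // Values without a known label (e.g. plain literals) keep only `value`.
      out[dimension] = label === undefined ? { value } : { value, label };
    }
    return out;
  });
}

// Made-up example data: one dimension, one observation.
const RAUM = "https://ld.stadt-zuerich.ch/statistics/property/RAUM";
const exampleLabels: Map<Iri, LabelLookup> = new Map([
  [RAUM, new Map([["https://example.org/raum/R1", "Example area"]])],
]);
const exampleRows: Row[] = [{ [RAUM]: "https://example.org/raum/R1" }];

const stitched = stitchLabels(exampleRows, exampleLabels);
```

With this split, the label lookup is paid once per dimension rather than once per observation, which is the saving the comment above is hoping for.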
Note that querying for labels on all dimensions makes everything much slower
Could you please tell us more about this? Would running datacube.components() to fetch all labels be too costly?
I meant that currently, cube.select(allDimensions).query() is much slower than cube.select([]).query() because selecting dimensions queries for all dimension value labels on each observation.
This is probably related to #47 … adding labels to the query unfortunately makes it much slower.
BTW, we're currently also always setting all potential languages on the entrypoint, e.g. ["de", "fr", "it", "en", ""], because some datasets may be available in only one of these and it's not clear what the fallback should be. Does adding more languages make the query slower? This could probably be optimized if the datasets declared their available languages correctly.
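To make the fallback question concrete, here is a small sketch of one possible reading: treat the order of the language array as the fallback order and take the first language that has a label. This is an assumption for illustration, not how the library resolves languages; `LangLabel` and `pickLabel` are hypothetical names, and the empty string stands for a literal without a language tag.

```typescript
// Hypothetical shape: a label literal plus its language tag ("" = untagged).
type LangLabel = { value: string; language: string };

// Return the label in the first preferred language that is available.
function pickLabel(
  labels: LangLabel[],
  preferred: string[],
): string | undefined {
  for (const lang of preferred) {
    const hit = labels.find((l) => l.language === lang);
    if (hit) return hit.value;
  }
  return undefined; // no label in any preferred language
}

// Made-up example: a dataset that only has German and untagged labels.
const available: LangLabel[] = [
  { value: "Raum", language: "de" },
  { value: "RAUM", language: "" },
];
const chosen = pickLabel(available, ["fr", "it", "en", "de", ""]);
```

Picking client-side like this does not by itself make the query cheaper. The cost discussed here comes from requesting every language in the first place, which is why declared per-dataset languages would help.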
Yeah, adding labels definitely makes things slower, and yes, adding more languages makes it even slower.
I think not fetching labels for automatically selected dimensions and using dimension IRIs as keys would solve most of the issue. Users could fetch dimensions and their labels independently and possibly cache them.
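A rough sketch of what caching labels independently could look like on the user side. It wraps any label-fetching function in a per-dimension cache; the `LabelFetcher` signature and the name `cachedLabels` are hypothetical and stand in for whatever call actually retrieves the labels.

```typescript
// Hypothetical signature for "fetch all value labels of one dimension".
type DimensionIri = string;
type ValueLabels = Map<string, string>; // value IRI -> label
type LabelFetcher = (dimension: DimensionIri) => Promise<ValueLabels>;

// Wrap a fetcher so each dimension is requested at most once.
function cachedLabels(fetchLabels: LabelFetcher): LabelFetcher {
  const cache = new Map<DimensionIri, Promise<ValueLabels>>();
  return (dimension) => {
    let pending = cache.get(dimension);
    if (!pending) {
      pending = fetchLabels(dimension);
      cache.set(dimension, pending);
      // Drop failed requests from the cache so a later call can retry.
      pending.catch(() => {
        cache.delete(dimension);
      });
    }
    return pending;
  };
}
```

Storing the promise rather than the resolved value means that concurrent callers asking for the same dimension share a single in-flight request.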
This could probably be optimized if the datasets declared available languages correctly.
@ktk what do you think about this, is it possible to declare the languages somewhere?
Hi!
I just noticed a discrepancy between how explicitly and automatically selected dimensions are handled, and another aspect which makes the automatic selects less-than-useful.
Point 2 could be solved neatly by not generating keys from the label but by using the dimension IRI instead. If the behavior in point 1 were consistent (i.e. labels present for auto-selects), this would actually remove the need to explicitly select dimensions at all.
For example:
If IRIs are used as keys, the argument to .select() could simply be an array of components or just their IRIs, instead of having to specify binding names myself (which is also dangerous since these are not slugified!).
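A sketch of how such an argument could be normalized, assuming a component is an object exposing its IRI. The `Component` shape, `toIris`, `bindingNames`, and the `https://example.org/` IRI are hypothetical and only illustrate the proposal. Binding names are generated by position, so nothing user-supplied needs to be slugified.

```typescript
// Hypothetical component shape: anything that carries its IRI.
type Component = { iri: string };
type Selectable = Component | string;

// Accept components or bare IRIs and reduce both to IRIs.
function toIris(selection: Selectable[]): string[] {
  return selection.map((s) => (typeof s === "string" ? s : s.iri));
}

// Generate safe positional variable names and remember which IRI each maps to,
// so result rows can be re-keyed by IRI afterwards.
function bindingNames(iris: string[]): Map<string, string> {
  return new Map(iris.map((iri, i): [string, string] => [`dim${i}`, iri]));
}

const RAUM_IRI = "https://ld.stadt-zuerich.ch/statistics/property/RAUM";
const selectedIris = toIris([
  { iri: RAUM_IRI },
  "https://example.org/dimension/ZEIT", // made-up IRI for illustration
]);
const bindings = bindingNames(selectedIris);
```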