Open fmaussion opened 3 years ago
I think that this bug should be mentioned on the download website as a big warning: "Downloading data with RGI format may not yield all outlines available in GLIMS"
@bruceraup I think that users should be aware of this problem with a warning on the download portal until this is solved - when they choose "RGI format" as download option (the default), it is possible that GLIMS won't provide them with the full data request.
Trello: https://trello.com/c/L86Yv2wQ
Hi @fmaussion, I'm going to take a look at why this is happening. Do you have a glacier id that I can use for testing this behavior?
@betolink I'm on it, but in the meantime a good candidate to check would be the entity mentioned here: https://github.com/GLIMS-RGI/glims_issue_tracker/issues/4
As I mentioned in the corresponding trello card, as long as these errors remain in GLIMS, the "download as RGI" option cannot work properly - so in terms of priority I would recommend to sanitize GLIMS first before attempting to fix this.
@betolink other examples:
- anlys_id 400185 (which are at least two entities)
- anlys_id 400542 (no clue what the problem with this one is)

I count 88 missing entities in subm_id 624 alone.
Thanks Fabien, your guess is that the misclassified orphan rocks are the ones causing the bug, correct? I'll take a look at these examples and the DB to understand what's going on. P.S. I don't have access to this Trello board.
> your guess is that the misclassified orphan rocks are the ones causing the bug, correct?
Partly. I think the bug is several bugs.
- anlys_id 400185: the bug comes from the fact that we have a glacier within another glacier's rock outcrop. In my code, the trick was to first sort the rocks by area before attributing them to an outline to avoid these situations.
- anlys_id 400542: no clue what's going on.

In all cases, any conversion script in GLIMS needs to have some safeguards after conversion, i.e. the total number of glac_bound should be the same before and after conversion to the RGI format.
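The safeguard described above could look roughly like the following. This is a minimal sketch, not the GLIMS conversion code: it assumes the conversion script can list one analysis ID per glac_bound polygon before and after conversion, and the function names `check_conversion` and `assert_conversion_complete` are hypothetical.

```python
from collections import Counter


def check_conversion(before_ids, after_ids):
    """Compare glac_bound records before and after a format conversion.

    Both arguments are iterables of analysis IDs, one entry per glac_bound
    polygon (hypothetical inputs; the real GLIMS schema may differ).
    Returns (dropped, added) as Counters of IDs lost or gained.
    """
    before = Counter(before_ids)
    after = Counter(after_ids)
    dropped = before - after  # IDs with fewer records after conversion
    added = after - before    # IDs with more records after conversion
    return dropped, added


def assert_conversion_complete(before_ids, after_ids):
    """Raise if the conversion changed the number of glac_bound records."""
    dropped, added = check_conversion(before_ids, after_ids)
    if dropped or added:
        raise ValueError(
            f"conversion changed glac_bound records: "
            f"{sum(dropped.values())} dropped, {sum(added.values())} added; "
            f"dropped IDs: {sorted(dropped)}"
        )
```

Comparing per-ID counts rather than only the totals means a dropped outline is still reported when another record happens to be duplicated, and the error message names the affected analysis IDs.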
Updated link to trello: https://trello.com/c/BBNEjwMn
@betolink @bruceraup any news on this?
As I said a while ago, I think it would have been much fairer to GLIMS users to add a big WARNING to the download option because of this issue, as long as it is not resolved (which seems to be a larger undertaking).
Hi @fmaussion, you're right. The fixes are not that complicated, but for some reason NSIDC gave priority to putting GLIMS data under the NASA EDL system. I think we can put a warning on the website until we release the fixed version.
I fixed this yesterday. As noted on Trello, outlines are no longer dropped. However, glac_bound polygons coming from a multi-polygon, as in the example above, still have the same analysis IDs. This "problem" must be resolved farther upstream. I say "problem", in quotes, because it's not actually wrong -- just a different way of doing things. RGI could group these together as multi-polygons (which is how they were submitted to GLIMS), as another potential solution to the problem of non-one-to-one correspondence between RGI IDs and analysis IDs. That said, converting all multi-polygons in GLIMS to separate ones is definitely on the to-do list.
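To illustrate the shared analysis IDs, here is a rough sketch of splitting multi-polygons into separate polygons and listing the IDs that end up on more than one outline. It assumes outlines are available as GeoJSON-like feature dicts with an `anlys_id` property; the `part_index` property and both function names are hypothetical and not part of GLIMS or RGI.

```python
from collections import Counter


def explode_multipolygons(features):
    """Split GeoJSON-like MultiPolygon features into single Polygon features.

    Each part keeps the parent's properties and gains a hypothetical
    `part_index` property so the parts can still be told apart.
    """
    out = []
    for feat in features:
        geom = feat["geometry"]
        if geom["type"] == "MultiPolygon":
            # A MultiPolygon's coordinates are a list of Polygon ring lists
            for i, rings in enumerate(geom["coordinates"]):
                out.append({
                    "type": "Feature",
                    "properties": {**feat["properties"], "part_index": i},
                    "geometry": {"type": "Polygon", "coordinates": rings},
                })
        else:
            out.append(feat)
    return out


def shared_analysis_ids(features, key="anlys_id"):
    """Return analysis IDs carried by more than one feature, with counts."""
    counts = Counter(f["properties"][key] for f in features)
    return {aid: n for aid, n in counts.items() if n > 1}
```

Running `shared_analysis_ids` on the exploded features gives the list of analysis IDs that would need new identifiers if each polygon is to map to exactly one RGI entity.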
Thanks! I'll check it ASAP. I have continued the discussion related to this in the Trello card. As discussed there, I don't think a multipolygon solution in RGI is compatible with glacier attributes such as is_tidewater or glacier length, etc.
All in the title - the number of glaciers should be the same regardless of the download type.