Open raymondben opened 10 months ago
(Also, minor suggestion that I stumbled across while debugging this: you don't need https://github.com/pepijn-devries/CopernicusMarine/blob/master/R/cms_list_stac_files.r#L7. Just put a .data$ prefix on assets in line 12.)
Hi @raymondben,
Thank you for the detailed report. This is a great help in improving the package. I will study your case and think about how best to handle the situation where STAC responds with just the file instead of a bucket. Your suggestions are really helpful.
(Also, minor suggestion that I stumbled across while debugging this: you don't need https://github.com/pepijn-devries/CopernicusMarine/blob/master/R/cms_list_stac_files.r#L7. Just put a .data$ prefix on assets in line 12.)
This is also a good point. I think that assets <- NULL is a relic from an earlier version where I didn't import rlang's pronoun .data. Your suggestion would make the code easier to read. I will update this.
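For readers unfamiliar with the two idioms, here is a minimal sketch of the difference. It assumes dplyr is installed; the toy data frame and the helper name pick_assets are hypothetical and are not taken from the package source.

```r
library(dplyr)  # dplyr re-exports the .data pronoun from rlang

# Toy table standing in for a parsed STAC response (illustrative columns only)
stac <- data.frame(id = c("a", "b"), assets = c("native", "other"))

# Older idiom: a dummy binding such as `assets <- NULL` inside the function,
# added only to silence the "no visible binding" NOTE from R CMD check.

# With the .data pronoun the column reference is explicit, so no dummy
# binding is needed.
pick_assets <- function(df, value) {
  dplyr::filter(df, .data$assets == value)
}

pick_assets(stac, "native")
```

Inside a package, the pronoun would typically be made available with an importFrom directive for rlang's .data rather than by attaching dplyr.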
I'll leave this issue open until I have decided on a definitive solution.
Cheers,
Pepijn
Thanks for the great work with this package. Found a small problem for datasets that consist of only a single file. e.g. the MDT dataset:
It is happening because their API is returning the actual file, not its bucket, when you query the stac properties:
cms_list_stac_files tries to issue a list-bucket request to this URL, which of course doesn't work.
I have a workaround for my own needs, but it would be good to fix. I have not provided a PR because I don't know the best solution. You could perhaps detect the fact that the href ends with an actual filename and throw that part away. But reliably detecting filenames might not be straightforward. You could match known file extensions, or perhaps even just paths that end with "." followed by two or three more characters, but either way seems like it would be fragile.
I don't think you can rely on href having a predictable structure (e.g. https://host/*-native-*/native/DATASET_ID/LAYER/FILE) because I am guessing that there could be additional subdirectories in between LAYER and FILE. (But if that's not the case, then this might work. Just throw away anything after the 7th element in https://github.com/pepijn-devries/CopernicusMarine/blob/master/R/cms_list_stac_files.r#L12.)
You definitely cannot rely on the actual URL in the href. For the example above, you can see that it's pointing to the file mdt_hybrid_cnes_cls18_cmems2020_global.nc. But that file doesn't actually exist, and when you do a bucket-list query on the bucket, it turns out that the file is called something else. That seems like an error from Copernicus, but nonetheless I think you still have to go through the bucket-list step.
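The two heuristics discussed above could be sketched in base R roughly as follows. Both helper names (strip_filename, truncate_href) and the example URL are made up for illustration and are not part of CopernicusMarine; this is a sketch of the idea under those assumptions, not a proposed fix.

```r
# Hypothetical helpers for illustration only.

# Heuristic 1: drop a trailing path element that looks like a file name,
# i.e. one that ends in "." followed by two or three alphanumeric characters.
strip_filename <- function(href) {
  sub("/[^/]+\\.[A-Za-z0-9]{2,3}$", "", href)
}

# Heuristic 2: keep only the first n "/"-separated elements of the href.
# With n = 7 this keeps everything up to and including LAYER in
# https://host/*-native-*/native/DATASET_ID/LAYER/FILE
truncate_href <- function(href, n = 7) {
  parts <- strsplit(href, "/", fixed = TRUE)[[1]]
  paste(utils::head(parts, n), collapse = "/")
}

# Made-up URL that follows the pattern described in the issue
href <- "https://example.org/mdl-native-01/native/DATASET_ID/LAYER/some_file.nc"

strip_filename(href)
truncate_href(href)

# With an extra subdirectory the two heuristics no longer agree
nested <- "https://example.org/mdl-native-01/native/DATASET_ID/LAYER/2020/some_file.nc"
strip_filename(nested)
truncate_href(nested)
```

On the flat URL both helpers return the same prefix. On the nested one, the first keeps the extra subdirectory and the second discards it, which is the fragility mentioned above. Either way the result would only be a starting point for the bucket-list request, since the filename in the href cannot be trusted.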