To do: Datamodel extension wasn't working properly from the very start. We still have to get it to work.
@dchandan
Not sure if you were already aware of this, but there is a STAC Extension for CF attributes: https://github.com/stac-extensions/cf. I believe this is the notation that `CFJsonItem` should use; it is not specific to the CMIP6 or Datacube extensions, but applies to any CF attribute used in NetCDF.
All code in `implementations` should be moved under `STACpopulator` so that it can be imported externally. Right now, it is not installed in the same module, which makes it unusable by other code bases that could reuse the implementations (e.g.: https://github.com/crim-ca/ncml2stac/blob/main/notebooks/ncml2stac.ipynb). For this notebook, I am particularly interested in avoiding duplicating the logic in `CMIP6populator.create_stac_item`.
@dchandan I realize based on your comments that I may not have communicated properly that the CMIP6 extension was meant to replace the Datamodel logic, not duplicate it. I just had not yet removed the logic in `add_CMIP6.py` while I was still experimenting with extensions.
So extensions might be a bad idea, but not because they replicate existing code (points 2.1 and 2.4).
Also, I did not have to define a schema to get it to work, so I am not sure this is an additional complication. If needed, pydantic can spit out the schema for us.
Basically, embedding the CMIP properties in an extension is just a way to formalize how we're adding CMIP properties into the STAC item, so that we're using the same mechanism to "extend" STAC items (datacube, cmip6, and eventually maybe other extensions).
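For instance, a minimal sketch of generating that schema with a hypothetical model (not the actual CMIP6 property definitions):

```python
import json
from pydantic import BaseModel, Field

# Hypothetical stand-in for the real CMIP6 properties model.
class CMIP6Props(BaseModel):
    activity_id: str = Field(serialization_alias="cmip6:activity_id")
    experiment_id: str = Field(serialization_alias="cmip6:experiment_id")

# JSON schema that could be published and referenced by the extension's SCHEMA_URI.
print(json.dumps(CMIP6Props.model_json_schema(), indent=2))
```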
@huard Thanks for the clarification!
I think that the schema URI is required. In your code it was specified in `SCHEMA_URI` and used later by the extension via the class method `get_schema_uri`. Since that specific URI does not work (I get a 404), I am inclined to think that no validation is performed using the schema, or even a check that the schema exists, so in that sense a working schema is perhaps not required. But I think that if one is using extensions, then one should have a properly constructed schema.
You're right that Pydantic's ability to spit out a schema comes in handy for this task, thereby reducing the barrier. But if one has to create a pydantic model anyway to generate a schema, then I don't see why not just use that model directly, rather than create another extension and then call on the pydantic model to generate the schema for use in the extension. At least in this particular case I don't see the point, since the extension would not be doing anything more logically sophisticated than what a pydantic model does. Maybe in some other case, for a different type of data, one might want an extension to incorporate additional logic. But that decision can, and in my opinion should, be made case by case.
(As a side note, it is my opinion that the way PySTAC has implemented STAC extensions is very non-pythonic which makes working with extensions a very unsatisfactory experience.)
I'm getting failures regarding the datacube extension.
Yeah, that extension still has to be debugged. It was not working for me from the start, and it is likely that a couple of things broke during the rearrangement.
I tend to agree with your assessment, but being unfamiliar with STAC, I thought it was more prudent to architect stuff using existing pathways. In practice, maybe it doesn't change anything, I'm just not confident I know the answer to this question.
In any case, there are a couple of points that could be streamlined in a future PR.
- Setting the `serialization_alias` for every property seems overly verbose. I think we could just export the dict and then modify the keys to add the prefix before ingestion into the STAC item. A sketch of that idea follows this list.
- Serializing numpy types. I suspect this could probably be done somewhere in xncml.
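For illustration, a minimal sketch of that key-prefixing approach (pydantic v2 assumed; the model and helper names are hypothetical):

```python
from typing import Any
from pydantic import BaseModel

class CMIP6Props(BaseModel):
    # No per-field serialization_alias needed.
    activity_id: str
    experiment_id: str

def with_prefix(props: BaseModel, prefix: str = "cmip6:") -> dict[str, Any]:
    """Export the model, then prefix every key before ingestion into the STAC item."""
    return {f"{prefix}{key}": value for key, value in props.model_dump().items()}

props = CMIP6Props(activity_id="ScenarioMIP", experiment_id="ssp245")
print(with_prefix(props))
# {'cmip6:activity_id': 'ScenarioMIP', 'cmip6:experiment_id': 'ssp245'}
```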
I agree.
> Setting the `serialization_alias` for every property seems overly verbose. I think we could just export the dict and then modify the keys to add the prefix before ingestion into the STAC item.
I have pushed a change with a new way in which the prefix is applied. Check it out and let me know your thoughts.
Looks good!
All `dict` or `list` typings should use more specific types, with `dict[<key>, <items>]` and `list[<item>]` notation.
All changes done, except in the `implementations/CMIP6-UofT/extensions.py` file, which can be handled in another PR related to making the Datacube extension work.
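For illustration, the kind of annotation change requested (hypothetical function, not from the codebase):

```python
from typing import Any

# Before: bare containers say nothing about their contents.
def collect_attrs_untyped(names: list) -> dict:
    return {name: None for name in names}

# After: parameterized containers document keys and items.
def collect_attrs(names: list[str]) -> dict[str, Any]:
    return {name: None for name in names}
```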
> All code in `implementations` should be moved under `STACpopulator` so that it can be imported externally. Right now, it is not installed in the same module, which makes it unusable by other code bases that could reuse the implementations (e.g.: https://github.com/crim-ca/ncml2stac/blob/main/notebooks/ncml2stac.ipynb). For this notebook, I am particularly interested in avoiding duplicating the logic in `CMIP6populator.create_stac_item`.
Done
Please take a look at https://github.com/crim-ca/stac-populator/pull/23/files
Discussions, back and forth interactions and tests with Nazim led to the creation of two links, one for collections and one for items.
> Please take a look at https://github.com/crim-ca/stac-populator/pull/23/files
> Discussions, back and forth interactions and tests with Nazim led to the creation of two links, one for collections and one for items.
Thank you! I had forgotten about this other PR. I'll incorporate changes from there into here.
I agree with Nazim, but I suggest deployment and auth be split into different PRs.
I agree, it was meant as a next steps comment following the merge of this PR.
I agree that discussion on authentication should be taken up in a separate PR. But Nazim raises a timely issue that we'll have to discuss together very soon. One approach that we at Toronto have discussed is to block all requests at the `/stac` endpoint except for `GET` at the proxy level, to prevent anyone unauthorized from editing the catalog, and to direct `stac-populator` to access the STAC App on an internal hostname/port that is not accessible from the outside network. This approach would remove the need to tie `stac-populator` to `magpie` and would not clutter the code with authentication-related logic. But we haven't thought it through to see if this is the best way to proceed, so it's just a prelim proposal. I can open an issue later today where we can discuss authentication.
> to access the STAC App on an internal hostname/port that is not accessible from the outside network.
For this PR https://github.com/bird-house/birdhouse-deploy/pull/386 to work, the request needs to pass through the proxy; without that, the hook won't work. But yes, we can discuss it later.
I think it would be best if we don't need to pass `$(IMP_DIR)/CMIP6-UofT/CMIP6.yml` in the python call. Since `add_CMIP6.py` is already in the same directory, it should just use the `.yml` in its directory.
Done :)
If no one has an objection, I will merge the PR.
> If no one has an objection, I will merge the PR.
Give me a few minutes. I'm testing the new classes in my notebook.
@dchandan I think the `implementations` subdirs are missing `__init__.py` files. They are not included in the site-packages dirs after installing it from this branch.
> @dchandan I think the `implementations` subdirs are missing `__init__.py` files. They are not included in the site-packages dirs after installing it from this branch.
Right.... Done.
Needs one in `implementations` itself as well.
Should also exclude `tests` from the install, either by explicitly listing the exclusion in pyproject or by moving it up outside `STACPopulator`. A sketch of the first option is below.
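For illustration, a minimal sketch of the explicit exclusion, assuming a setuptools-backed build (with pyproject.toml-only metadata, the same exclude list goes under setuptools' package-discovery table):

```python
# Hypothetical setup.py equivalent; keeps test packages out of site-packages.
from setuptools import find_packages, setup

setup(
    name="STACpopulator",
    packages=find_packages(exclude=["tests", "tests.*"]),
)
```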
I don't think Python package / setuptools resolution likes the names with hyphens and mixed lower/upper characters under `implementations`.
This line: https://github.com/crim-ca/stac-populator/blob/c62fb801439b2d4900550e157df431984122654e/STACpopulator/implementations/CMIP6-UofT/add_CMIP6.py#L9 causes:
```
Cell In[11], line 13
     10 import xncml
     11 from pydantic.networks import Url
---> 13 from STACpopulator.implementations.CMIP6_UofT.add_CMIP6 import CMIP6populator

File ~/dev/daccs/ncml2stac/src/stacpopulator/STACpopulator/implementations/CMIP6_UofT/add_CMIP6.py:9
      7 import pyessv
      8 from colorlog import ColoredFormatter
----> 9 from extensions import DataCubeHelper
     10 from pydantic import AnyHttpUrl, ConfigDict, Field, FieldValidationInfo, field_validator
     12 from STACpopulator import STACpopulatorBase

ModuleNotFoundError: No module named 'extensions'
```
When installed, the extension is looked up in the root `STACpopulator` instead of nested under the specific implementation. It must be replaced by `from STACpopulator.implementations.CMIP6_UofT.extensions import DataCubeHelper`.
I renamed `CMIP6-UofT` -> `CMIP6_UofT` to make other imports work.
Thanks for catching these @fmigneault! Hopefully all those issues are fixed now. Please confirm.
I've been trying to make the notebook work with the latest changes. So far, no luck.
I basically need to hack my way around the `CMIP6populator`, `THREDDSLoader` and the underlying `TDSCatalog` implementations, since I'm trying to only convert a single NCML, not crawl the entire catalog. All of those automatically perform a lot of unnecessary `__iter__` calls right from the moment `__init__` is done. The way that NCML data is extracted from the XML response, converted to a `siphon.catalog.Dataset`, and passed down from `THREDDSLoader.extract_metadata` to `CMIP6populator.create_stac_item` is too tightly coupled. Although they are distinct classes, there is no easy way to swap in loader implementations that attempt anything different.
The original implementation from the previous branch, although it replicated a few functions to shuffle some metadata, was much less convoluted.
The `STACpopulatorBase` class also does some checks of YAML configs that are also irrelevant for my use cases. I have to mock files just to avoid `os.path.isfile` checks, or rewrite `CMIP6populator.__init__` entirely.
Still not working attempt: https://github.com/crim-ca/ncml2stac/commit/9da080bbe912dc54654e5bab9d2b29880f8d027b
> I'm trying to only convert a single NCML, not crawl the entire catalog.
I wonder if you are trying to do something that this piece of code was never meant to do.... I am usually not in favour of tightly coupled apps myself; I prefer to create frameworks that allow one to write solutions for different use cases. I think this does that for the purpose of creating STAC catalogs as it facilitates the main aspects of that workflow: (i) iterating over some input, (ii) producing an appropriate representation of STAC, and (iii) posting that representation to a server. Processing a single file was never the purpose, though I am absolutely not opposed to exploring if this additional use case can be added easily without much reworking (something I do not want to do at this point).
I am curious which branch you found more helpful in this regard. I don't think this structure has changed in a meaningful way through the development.
> The `STACpopulatorBase` class also does some checks of YAML configs that are also irrelevant for my use cases. I have to mock files just to avoid `os.path.isfile` checks, or rewrite `CMIP6populator.__init__` entirely.
Right, but again, this is there to serve a need directly associated with the purpose of this code. There needed to be a reasonable way to pass the various pieces of information about a new collection/catalog to the code. The hardcoded config file helps alleviate the requirement to pass that information from the command line, and was suggested by Nazim above. And I think it helps.
Using branch https://github.com/Ouranosinc/stac-populator/tree/collection_link, the steps were very simple. I left the relevant code for that branch commented in the notebook for comparison.
The steps used to generate the STAC Item were simply:
```python
ds = xncml.Dataset(input_ncml)
attrs = ds.to_cf_dict()
stac_item_id = make_cmip6_item_id(attrs["attributes"])
attrs["id"] = stac_item_id
stac_item = CFJsonItemNetCDF(stac_item_id, attrs, cmip6.Properties)
DatacubeExt(stac_item)
c_item_data = stac_item.item.to_dict()
```
`CFJsonItemNetCDF` essentially only added the `rel: source` link on top of what `CFJsonItem` already did.

The only way this could have been improved even more was if `CMIP6populator.create_stac_item` (or a similar method) did all these steps directly instead of starting with `input_ncml_data` as argument. Basically, I would like to have `xncml.Dataset.to_cf_dict()` abstracted away from the notebook as well, but the worst case now would be to do:
```python
ds = xncml.Dataset(input_ncml)
input_ncml_data = ds.to_cf_dict()
CMIP6populator.create_stac_item("name?", input_ncml_data)
```
Basically, what would be needed is to expose methods that allow calling the following directly:

- `THREDDSLoader.__iter__`
- `THREDDSLoader.extract_metadata`
- `CMIP6populator.ingest`
- `CMIP6populator.create_stac_item`

But right now, the intermediate objects needed for each step are obtained from the `__iter__` yields of the underlying catalog parsers.
To patch the objects and skip the `__iter__` calls that assign each object's attributes, one basically has to reimplement the whole processing chain.
@dchandan I now have a somewhat working version: https://github.com/crim-ca/ncml2stac/compare/main...5b3a3baabb41901b47b53d3e80e30479816ecb9c I've added comments that hopefully illustrate the kind of overrides I am referring to.
I see that `stac_extensions` is still empty. Also, `DatacubeExt` is not applied anymore (no `cube:variables` and `cube:dimensions` properties). See the last cell output diff.
Exception raised by datacube:
```
WARNING: [STACpopulator.implementations.CMIP6_UofT.add_CMIP6] Failed to add Datacube extension to item sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc
Traceback (most recent call last):
  File "/home/francis/dev/daccs/ncml2stac/src/stacpopulator/STACpopulator/implementations/CMIP6_UofT/add_CMIP6.py", line 157, in create_stac_item
    dc_ext = DatacubeExtension.ext(item, add_if_missing=True)
NameError: name 'DatacubeExtension' is not defined
```
Following are some changes that would greatly simplify the definition of a custom populator without hackish overrides:
1. Allow derived populators to pass down arguments to their `data_loader`. For example, `depth` would be useful here. It could simply be a `**data_loader_kwargs` to let implementations evolve easily with more parameters. (A sketch covering these suggestions follows the list.)
https://github.com/crim-ca/stac-populator/blob/10db1281b46298828727293159e7a5a52e71cb89/STACpopulator/implementations/CMIP6_UofT/add_CMIP6.py#L125-L135
2. Move those lines into a separate method, something like `get_collection_metadata`. This way, an implementation that wishes to return the `title`, `description`, etc. without a file could do so easily by overriding it.
https://github.com/crim-ca/stac-populator/blob/10db1281b46298828727293159e7a5a52e71cb89/STACpopulator/populator_base.py#L47-L64
3. Do not perform these steps in `__init__`; do them at the start of `ingest`. This way, a populator implementation that does not need these definitions (or wants alternative ones) can override the relevant operations and have them called only when needed, not during object instantiation.
https://github.com/crim-ca/stac-populator/blob/10db1281b46298828727293159e7a5a52e71cb89/STACpopulator/populator_base.py#L66-L69
4. Require that `GenericLoader` and its derived classes implement `__getitem__`, not only `__iter__`, to obtain a specific item by `item_name` directly. In `THREDDSLoader`, that would be equivalent to doing `self.catalog_head.datasets.get()`. A fancier implementation could also accept a list of `item_name` values as argument, which would call `self.catalog_head.datasets.get()` iteratively, going deeper using:
https://github.com/crim-ca/stac-populator/blob/10db1281b46298828727293159e7a5a52e71cb89/STACpopulator/input.py#L84-L88
... but instead of iterating on yielded items, it would access them directly. That would make a huge difference if there are indeed thousands of NetCDF files, as you mentioned in the STAC hook PR.
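A minimal sketch of what suggestions 1, 3, and 4 might look like, assuming nothing about the current internals (all method bodies and the `loader_cls` wiring are hypothetical):

```python
from abc import ABC, abstractmethod
from typing import Any, Iterator

class GenericLoader(ABC):
    """Loaders would expose random access, not only iteration."""

    @abstractmethod
    def __iter__(self) -> Iterator[tuple[str, dict[str, Any]]]:
        """Crawl the source, yielding (item_name, metadata) pairs."""

    @abstractmethod
    def __getitem__(self, item_name: str) -> dict[str, Any]:
        """Fetch a single item's metadata directly, without a full crawl."""

class PopulatorBase(ABC):
    def __init__(self, stac_host: str, loader_cls: type[GenericLoader], **data_loader_kwargs: Any) -> None:
        # Suggestion 1: extra kwargs (e.g. depth=1) flow straight to the loader.
        self.stac_host = stac_host
        self._loader = loader_cls(**data_loader_kwargs)
        # Suggestion 3: no YAML/config checks here.

    @abstractmethod
    def load_config(self) -> None:
        """Validate collection metadata (overridable, so file-less implementations are possible)."""

    @abstractmethod
    def create_stac_item(self, item_name: str, metadata: dict[str, Any]) -> dict[str, Any]:
        ...

    def ingest(self) -> None:
        # Suggestion 3: config checks deferred to ingestion time.
        self.load_config()
        for item_name, metadata in self._loader:
            self.create_stac_item(item_name, metadata)
```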
Assuming those are implemented, the sample notebook could simply do the following:
```python
ncml_url = "https://svc.com/thredds/ncml/some/nested/netcdf.nc"
ncml_name = os.path.split(ncml_url)[-1]
catalog_url = "https://svc.com/thredds/catalog/some/nested/catalog.xml"
stac_host = "https://svc.com/stac"

cmip6_pop = CMIP6Populator(stac_host, catalog_url, depth=1)
ncml_data = cmip6_pop.get(ncml_name)
stac_item = cmip6_pop.create_stac_item(ncml_name, ncml_data)
```
That would make a very clean solution, and would make the populator flexible for collection vs single-item handling. Note that single-item handling is necessary, for example, to update only specific items that we know have problematic metadata, without having to regenerate the whole catalog.
> The steps used to generate the STAC Item were simply:
>
> ```python
> ds = xncml.Dataset(input_ncml)
> attrs = ds.to_cf_dict()
> stac_item_id = make_cmip6_item_id(attrs["attributes"])
> attrs["id"] = stac_item_id
> stac_item = CFJsonItemNetCDF(stac_item_id, attrs, cmip6.Properties)
> DatacubeExt(stac_item)
> c_item_data = stac_item.item.to_dict()
> ```
Well, you can still do something nearly identical:

```python
ds = xncml.Dataset(input_ncml)
attrs = ds.to_cf_dict()
stac_item_id = make_cmip6_item_id(attrs["attributes"])
attrs["id"] = stac_item_id
item = STAC_item_from_metadata(stac_item_id, attrs, CMIP6ItemProperties, GeoJSONPolygon)
# some appropriate Datacube stuff
```
> @dchandan I now have a somewhat working version: crim-ca/ncml2stac@main...5b3a3ba I've added comments that hopefully illustrate the kind of overrides I am referring to.
>
> I see that `stac_extensions` is still empty. Also, `DatacubeExt` is not applied anymore (no `cube:variables` and `cube:dimensions` properties). See the last cell output diff. Exception raised by datacube:
>
> ```
> WARNING: [STACpopulator.implementations.CMIP6_UofT.add_CMIP6] Failed to add Datacube extension to item sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc
> Traceback (most recent call last):
>   File "/home/francis/dev/daccs/ncml2stac/src/stacpopulator/STACpopulator/implementations/CMIP6_UofT/add_CMIP6.py", line 157, in create_stac_item
>     dc_ext = DatacubeExtension.ext(item, add_if_missing=True)
> NameError: name 'DatacubeExtension' is not defined
> ```
Yes, this is a work in progress. It was mentioned earlier in the PR.
> Well, you can still do something nearly identical:
Not exactly. The links are not generated the same way. This is why I was waiting on this implementation. See how the service names are defined in the diff of notebook output.
Also, I would rather have all the parsing logic in the `stac-populator`; this way we can be sure the result is identical on both ends. It is preferable to have one reference implementation for converting NCML to STAC. The notebook is only there to demonstrate how to make a deployable CWL for Weaver directly from a notebook on GitHub. It would preferably not have any custom logic related to the NCML-to-STAC conversion; this is just the use case I happened to pick to demonstrate the feature.
> Also, I would rather have all the parsing logic in the `stac-populator`; this way we can be sure the result is identical on both ends.
I don't know what "both ends" means. I don't follow your use case. The NCML-to-STAC conversion is a workflow that happens on the server side, initiated by the admin to populate the STAC database. The code here does that job properly (or are you saying that there is a deficiency in the STAC representation it creates? That would be useful to know).
We can chat about what exactly you are trying to do tomorrow over a zoom call.
I think I've figured out what's going on. The issue is with the idiosyncrasies of how the THREDDS server returns metadata based on how the metadata is queried. We can discuss this on the call today.
@fmigneault I've made changes to allow the kinds of workflow we discussed last Friday. Please check it out. The creation of a single STAC item can be tested using `tests/test_standalone_stac_item.py`.
You had commented earlier that:
> The steps used to generate the STAC Item were simply:
>
> ```python
> ds = xncml.Dataset(input_ncml)
> attrs = ds.to_cf_dict()
> stac_item_id = make_cmip6_item_id(attrs["attributes"])
> attrs["id"] = stac_item_id
> stac_item = CFJsonItemNetCDF(stac_item_id, attrs, cmip6.Properties)
> ```
This does exactly that. I am not following the objection now....
> This does exactly that. I am not following the objection now....
Indeed. But as we discussed in our 1-on-1 call, I would rather have a single operation without any specific logic related to generating the attributes from the NCML and the STAC Item. What the notebook did before was not "better", just easier with fewer overrides, but it was still a temporary implementation until the logic is entirely self-contained by stac-populator.
> > This does exactly that. I am not following the objection now....
>
> Indeed. But as we discussed in our 1-on-1 call, I would rather have a single operation without any specific logic related to generating the attributes from the NCML and the STAC Item. What the notebook did before was not "better", just easier with fewer overrides, but it was still a temporary implementation until the logic is entirely self-contained by stac-populator.
Okay. I will leave it for a later refactoring of code. Right now, I need to get this thing working and deploy it for our use.
@fmigneault Any idea why this "https://psl.noaa.gov/thredds/ncml/Datasets/NARR/Dailies/pressure/air.197901.nc?catalog=http://psl.noaa.gov/thredds/catalog/Datasets/NARR/Dailies/pressure/catalog.html&dataset=Datasets/NARR/Dailies/pressure/air.197901.nc" catalog does not contain the same information as "https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/ncml/birdhouse/testdata/xclim/cmip6/sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc?catalog=https%3A%2F%2Fpavics.ouranos.ca%2Ftwitcher%2Fows%2Fproxy%2Fthredds%2Fcatalog%2Fbirdhouse%2Ftestdata%2Fxclim%2Fcmip6%2Fcatalog.html&dataset=birdhouse%2Ftestdata%2Fxclim%2Fcmip6%2Fsic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc"?
Both have query parameters, but they behave differently. While the Ouranos one gives service URLs, the other one does not.
Maybe it is the versions?
https://psl.noaa.gov/thredds/catalog/catalog.xml reports `version="1.2"`, while https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/catalog.xml reports `version="1.0.1"`.
Otherwise, maybe the contents of the NetCDF files themselves differ. I am not familiar enough with the metadata those files contain. @huard ?
Maybe it is a version thing.
That complicates things. I can't use the query-parameter-based link anymore because different servers may behave differently. I will revert to using access links by opening the catalog itself using siphon. My tests show that approach works with both of these servers.
Check if you can get the same results by overriding `siphon.catalog.TDSCatalog._process_dataset` with a no-op method, or something that limits how much crawling it does on `__init__`. If the THREDDS server has a lot of datasets, it starts parsing everything right away, which is quickly inefficient if the purpose is only to describe a single NCML file or a subset of a specific dataset. I tried with `siphon.catalog.TDSCatalog("psl.noaa.gov/thredds/catalog/catalog.xml")` and it takes some time to load 3759 datasets... A rough sketch of the idea follows.
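For illustration, an untested sketch of that override (it assumes siphon's `TDSCatalog` keeps the private `_process_dataset` hook, and it may break catalog features that rely on the processed datasets):

```python
from siphon.catalog import TDSCatalog

class ShallowTDSCatalog(TDSCatalog):
    """Hypothetical catalog that skips eager per-dataset processing."""

    def _process_dataset(self, element):
        # No-op: do not resolve access URLs for every dataset during __init__.
        pass

# Loads the catalog structure without paying the cost of processing
# thousands of datasets up front.
catalog = ShallowTDSCatalog("https://psl.noaa.gov/thredds/catalog/catalog.xml")
```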
Even using `siphon.catalog.TDSCatalog`, I think you will need the query parameter, because the URL of the NCML itself (pavics.ouranos.ca/twitcher/ows/proxy/thredds/ncml/birdhouse/testdata/xclim/cmip6/sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc) does not resolve the reference of its parent dataset.
So I don't understand why there are differences with and without query parameters. I found out empirically by inspecting the URL of the NcML service exposed in the catalog web page, and have been using it since because the response contains more information.
Closed as this was getting too long. Will revisit outstanding issues in future PRs.
Here is my proposal for finalizing the architecture of STAC populator. I have chosen to branch from `arch-changes` rather than overlay my changes onto that branch, so once we finalize this PR we can merge it back into `arch-changes` and close PR #16.

There is currently only one implementation, for CMIP6 data residing on the UofT node. We expect more implementations to follow soon. Ouranos will probably want to have an implementation for CMIP6 data on the Ouranos node. I am also working on two other implementations, one for the NEX-GDDP data and one that populates our catalog with data from other catalogs (this implementation can probably be used verbatim by more than one Marble node).
Key changes:
- **Extracting pydantic models into a separate file.** I moved the Pydantic models from `stac_utils.py` to a new `models.py` file.

- **Removing cmip6 extension.** After examining the code, I came to the opinion that the cmip6 extension was not adding much value and could be removed. Here's why: in `create_stac_item`, all appropriate CMIP6 properties were already added to the `pystac.Item` instance from the Pydantic data model that describes the CMIP6 Item properties (`CMIP6ItemProperties` in `add_CMIP6.py`). Since the extension was simply adding properties to the Item that were already being added by the Pydantic model, in that form it was not adding any additional value.

- **Breaking up `CFJsonItem`.** I felt that `CFJsonItem` was doing two things that didn't quite fit together within one class. On the one hand it contained logic for the creation of a new `pystac.Item` object from CF metadata, and on the other hand there was logic specifically for use by the `pystac.extensions.datacube` extension. The former is applicable to all items that are created by crawling the THREDDS catalog, but the `datacube` extension code is not applicable to all STAC items, because the extension itself is not applicable to all types of data (or maybe one doesn't want to use the extension even for data to which it is applicable). Therefore, having the extension logic in a class that gets instantiated for every catalog item didn't make sense to me.

  So, I dissolved `CFJsonItem` and extracted its logic as follows: code that pertains to the creation of a STAC item (more specifically, a `pystac.Item` class representing a STAC item) is in a new function in `stac_utils.py` called `STAC_item_from_metadata`, while code meant to support the `datacube` extension is now in a new file, `extensions.py`, in the `CMIP6-UofT` implementations folder. I moved that code to the implementations folder rather than the core section because I am unsure how generally applicable it would be to other types of model output data, but this could be changed.