Closed mishaschwartz closed 3 weeks ago
Oh wait, can you do the same changes for optional-components/testthredds as well?
@tlvu
Oh wait, can you do the same changes for optional-components/testthredds as well?
Isn't this set up so that we can run tests against a different THREDDS server? If the tests don't require a different configuration why do we need to change this as well?
Isn't this set up so that we can run tests against a different THREDDS server?
To test a different version, yes.
If the tests don't require a different configuration why do we need to change this as well?
I meant to allow the same customizations for the test. Currently we are testing THREDDS v5 using this testthredds on our production host. By the same token, we could test additional configs at the same time, so the same customizations would be useful.
So I meant to add TESTTHREDDS_SERVICE_DATA_EXTRA_FILE_FILTERS and TESTTHREDDS_DATASET_DATASETSCAN_BODY.
@fmigneault
The THREDDS definitions need to align with Magpie definitions to protect the contents accordingly.
What was the intention for how the Magpie permissions were supposed to interact with additional catalogs introduced by setting the THREDDS_ADDITIONAL_CATALOG variable?
Is there a solution in place for that already or do I have to come up with a solution that accounts for arbitrary catalog definitions as well?
@mishaschwartz
For Magpie permissions, it does not really matter how the catalogs (the datasetScan blocks) are defined. Magpie works only with the resolved URL paths. The "catalog" is the default "browsing" service that THREDDS uses when navigating its hierarchy, and the URLs are all formed as /thredds/{service}/{nested-dirs...}. So, on the Magpie side, the resources are defined as {thredds-service-type}/{nested-dirs-resources...}/{file-resource} (note that the {service} is omitted). Adding more catalogs would only require reflecting these new directories, as they are resolved by URL, under the Magpie THREDDS service.
The service combinations are defined directly in the Magpie THREDDS service configuration. Depending on the "prefixes" (i.e., the service), it handles GET requests as either browse or read permissions. All services and file extensions that are classified as providing metadata are typically browse, and the actual data accesses are read. In some edge cases, a service can be both (e.g., WMS and WCS, which can describe or get the data based on a request parameter), and is therefore placed in the read category.
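To make the mapping above concrete, here is a minimal Python sketch of how a THREDDS URL path could be turned into a Magpie resource path and a permission name. It is not Magpie's actual code: the function name `resolve`, the prefix sets, and the Magpie service name `thredds` are assumptions chosen for illustration, and the real prefix lists live in the Magpie THREDDS service configuration.

```python
from typing import Optional, Tuple

# Illustrative prefix sets (assumptions, not the real Magpie configuration).
BROWSE_PREFIXES = {"catalog", "ncml", "uddc", "iso"}   # metadata-like services
READ_PREFIXES = {"fileServer", "dodsC", "wcs", "wms"}  # data access services
MAGPIE_SERVICE = "thredds"  # assumed name of the Magpie THREDDS service


def resolve(url_path: str) -> Tuple[Optional[str], str]:
    """Map /.../thredds/{service}/{nested-dirs...} to (permission, resource path).

    The {service} segment selects the permission and is omitted from the
    resource path. Raises ValueError if the path has no "thredds" segment
    or nothing after it.
    """
    parts = [p for p in url_path.split("/") if p]
    rest = parts[parts.index("thredds") + 1:]
    if not rest:
        raise ValueError("no THREDDS service segment in path")
    service, nested = rest[0], rest[1:]
    if service in READ_PREFIXES:
        permission = "read"
    elif service in BROWSE_PREFIXES:
        permission = "browse"
    else:
        permission = None  # unknown prefix: no permission matches in this sketch
    return permission, "/".join([MAGPIE_SERVICE, *nested])


print(resolve("/twitcher/ows/proxy/thredds/fileServer/birdhouse/testdata/file.nc"))
print(resolve("/thredds/ncml/birdhouse/testdata/file.nc"))
```

Both calls resolve to the same resource path because the service segment is dropped. Only the permission differs.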
@fmigneault
Thanks for the detailed explanation. I understand the Magpie configuration and how its "prefix" definitions relate to the URLs in THREDDS.
My concern is more about whether we need to be able to customize the "file_patterns" definitions in the Magpie configuration files to handle duplicate file extensions other than .nc and .ncml.
The other concern is that users can define custom service definitions if they'd like other than the ones listed here:
Or they could potentially modify the base attribute so that the URL path no longer matches the prefix defined in Magpie.
I propose we either do:
I'm working on a solution, but if you have any insight into this issue, let me know.
If custom service types are added, they must be provided in the browse/read section accordingly for Magpie to grant/deny access to them as expected. Similarly, additional file patterns (or extensions) must also be provided.
If a location other than /twitcher/ows/proxy/thredds/{service}/ is employed, it will not be managed by Magpie/Twitcher. However, if the service uses the same /twitcher/ows/proxy/thredds/ prefix (it should, since it is under this THREDDS docker service), it will default to DENY access, unless "full-access" was granted on the THREDDS service to an anonymous group. Therefore, I don't think modifying base is the right approach. (BTW, they should probably also inherit from TWITCHER_PROTECTED_PATH rather than being hard-coded.)
Rather than having THREDDS_SERVICE_DATA_EXTRA_FILE_FILTERS directly with the XML, maybe we should have a THREDDS_SERVICE_DATA_EXTRA_FILE_BROWSE_EXTENSIONS and THREDDS_SERVICE_DATA_EXTRA_FILE_READ_EXTENSIONS, and generate THREDDS_SERVICE_DATA_EXTRA_FILE_FILTERS from them? There are actually already some discrepancies (e.g., .md, .rst, .csv are missing in Magpie but listed in THREDDS). A few warnings/descriptions would help explain how to customize them.
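As a rough illustration of the proposal to derive the filter XML from extension lists, the Python sketch below builds one THREDDS `<include wildcard="..."/>` element per unique extension. The function name `build_file_filters` is hypothetical, and it assumes that THREDDS_SERVICE_DATA_EXTRA_FILE_FILTERS holds `<include>` elements that end up inside a `<filter>` block. The actual template in birdhouse-deploy may expect something different.

```python
from typing import Iterable, List
from xml.sax.saxutils import quoteattr


def build_file_filters(browse_exts: Iterable[str], read_exts: Iterable[str]) -> str:
    """Render <include> filter elements for the union of both extension lists.

    Extensions may be given with or without a leading dot. Order is
    preserved and duplicates are emitted only once.
    """
    seen: List[str] = []
    for ext in [*browse_exts, *read_exts]:
        ext = ext if ext.startswith(".") else "." + ext
        if ext not in seen:
            seen.append(ext)
    # quoteattr adds the surrounding quotes and escapes XML special characters.
    return "\n".join(f"<include wildcard={quoteattr('*' + ext)} />" for ext in seen)


# Hypothetical values for the two proposed variables.
print(build_file_filters(browse_exts=[".json"], read_exts=["h5", ".json"]))
```

Keeping the two input lists separate means the same values could also feed the Magpie browse and read file patterns, which is the point of the proposal.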
After several iterations, I don't think that there is an easy way to get the flexibility we want by defining these variables while also enforcing the Magpie settings. So the compromise I went with is to add some defaults and make the Magpie settings configurable as well, so that they can be updated as needed. I also added some instructions/warnings about how to configure Magpie to match changes to THREDDS.
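Since the two configurations now have to be kept in sync by hand, a small check like the following Python sketch could flag extensions that THREDDS serves but Magpie has no pattern for. The THREDDS set is the default extension list from the PR overview. The Magpie set is an assumption inferred from the discrepancy mentioned earlier in the thread, not read from the real configuration.

```python
from typing import Iterable, List

# Default extensions served by THREDDS, as listed in the PR overview.
THREDDS_EXTENSIONS = {".nc", ".ncml", ".txt", ".md", ".rst", ".csv"}
# Assumed Magpie file-pattern extensions (illustrative only).
MAGPIE_EXTENSIONS = {".nc", ".ncml", ".txt"}


def missing_in_magpie(thredds_exts: Iterable[str], magpie_exts: Iterable[str]) -> List[str]:
    """Return extensions served by THREDDS that Magpie has no pattern for."""
    return sorted(set(thredds_exts) - set(magpie_exts))


print(missing_in_magpie(THREDDS_EXTENSIONS, MAGPIE_EXTENSIONS))
```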
Build URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2845/
Result :white_check_mark: SUCCESS
BIRDHOUSE_DEPLOY_BRANCH : thredds-more-configuration
DACCS_IAC_BRANCH : master
DACCS_CONFIGS_BRANCH : master
PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master
PAVICS_SDI_BRANCH : master
DESTROY_INFRA_ON_EXIT : true
PAVICS_HOST : https://host-140-216.rdext.crim.ca
Tests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/1716/
[2024-10-15T15:04:46.597Z] ============================= test session starts ==============================
[2024-10-15T15:04:46.597Z] platform linux -- Python 3.11.6, pytest-8.2.0, pluggy-1.5.0
[2024-10-15T15:04:46.597Z] rootdir: /home/jenkins/agent/workspace/PAVICS-e2e-workflow-tests_master
[2024-10-15T15:04:46.597Z] plugins: anyio-4.3.0, dash-2.17.0, nbval-0.11.0, tornasync-0.6.0.post2, xdist-3.5.0
[2024-10-15T15:04:46.597Z] collected 301 items
[2024-10-15T15:04:46.597Z]
[2024-10-15T15:04:56.278Z] notebooks-auth/geoserver.ipynb .................. [ 5%]
[2024-10-15T15:06:12.447Z] notebooks-auth/test_cowbird_jupyter.ipynb .......... [ 9%]
[2024-10-15T15:06:12.709Z] notebooks-auth/test_thredds.ipynb ........... [ 12%]
[2024-10-15T15:06:59.509Z] pavics-sdi-master/docs/source/notebooks/CaSR_basic.ipynb ...... [ 14%]
[2024-10-15T15:07:09.383Z] pavics-sdi-master/docs/source/notebooks/WCS_example.ipynb ....... [ 17%]
[2024-10-15T15:07:18.554Z] pavics-sdi-master/docs/source/notebooks/WFS_example.ipynb ...... [ 19%]
[2024-10-15T15:14:37.025Z] pavics-sdi-master/docs/source/notebooks/climex.ipynb ............ [ 23%]
[2024-10-15T15:14:37.025Z] pavics-sdi-master/docs/source/notebooks/eccc-geoapi-climate-stations.ipynb . [ 23%]
[2024-10-15T15:14:43.469Z] ............... [ 28%]
[2024-10-15T15:14:51.365Z] pavics-sdi-master/docs/source/notebooks/eccc-geoapi-xclim.ipynb ..... [ 30%]
[2024-10-15T15:14:58.357Z] pavics-sdi-master/docs/source/notebooks/esgf-dap.ipynb ....... [ 32%]
[2024-10-15T15:15:13.478Z] pavics-sdi-master/docs/source/notebooks/forecasts.ipynb ...... [ 34%]
[2024-10-15T15:15:37.282Z] pavics-sdi-master/docs/source/notebooks/opendap.ipynb ....... [ 36%]
[2024-10-15T15:15:41.808Z] pavics-sdi-master/docs/source/notebooks/pavics_thredds.ipynb ..... [ 38%]
[2024-10-15T15:20:10.909Z] pavics-sdi-master/docs/source/notebooks/regridding.ipynb ............... [ 43%]
[2024-10-15T15:21:20.639Z] ............. [ 47%]
[2024-10-15T15:21:22.576Z] pavics-sdi-master/docs/source/notebooks/rendering.ipynb .... [ 49%]
[2024-10-15T15:21:24.359Z] pavics-sdi-master/docs/source/notebooks/subset-user-input.ipynb ........ [ 51%]
[2024-10-15T15:21:39.983Z] ................. [ 57%]
[2024-10-15T15:21:47.753Z] pavics-sdi-master/docs/source/notebooks/subsetting.ipynb ...... [ 59%]
[2024-10-15T15:21:49.137Z] pavics-sdi-master/docs/source/notebook-components/weaver_example.ipynb . [ 59%]
[2024-10-15T15:22:12.317Z] ......... [ 62%]
[2024-10-15T15:22:21.423Z] finch-master/docs/source/notebooks/dap_subset.ipynb ........... [ 66%]
[2024-10-15T15:22:30.278Z] finch-master/docs/source/notebooks/finch-usage.ipynb ...... [ 68%]
[2024-10-15T15:22:31.672Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb . [ 68%]
[2024-10-15T15:22:34.736Z] ..... [ 70%]
[2024-10-15T15:22:49.854Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-2Subsetting.ipynb . [ 70%]
[2024-10-15T15:23:06.031Z] ............ [ 74%]
[2024-10-15T15:23:20.940Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-3Climate-Indicators.ipynb . [ 75%]
[2024-10-15T15:23:43.719Z] .....s. [ 77%]
[2024-10-15T15:23:51.883Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-4Ensembles.ipynb . [ 77%]
[2024-10-15T15:24:08.192Z] .. [ 78%]
[2024-10-15T15:24:16.345Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-5Visualization.ipynb . [ 78%]
[2024-10-15T15:25:14.907Z] ......... [ 81%]
[2024-10-15T15:25:24.916Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-6Regridding_Conversion.ipynb . [ 82%]
[2024-10-15T15:30:01.878Z] .... [ 83%]
[2024-10-15T15:30:01.878Z] PAVICS-landing-master/content/notebooks/hydrology/PAVICStutorial_Hydrology-01_Intro.ipynb . [ 83%]
[2024-10-15T15:30:01.878Z] .... [ 85%]
[2024-10-15T15:30:05.180Z] PAVICS-landing-master/content/notebooks/hydrology/PAVICStutorial_Hydrology-02_Calibration.ipynb . [ 85%]
[2024-10-15T15:30:11.022Z] ..... [ 87%]
[2024-10-15T15:30:16.314Z] PAVICS-landing-master/content/notebooks/hydrology/PAVICStutorial_Hydrology-03_Watershed_properties.ipynb . [ 87%]
[2024-10-15T15:30:33.895Z] ............. [ 91%]
[2024-10-15T15:30:38.115Z] PAVICS-landing-master/content/notebooks/hydrology/PAVICStutorial_Hydrology-04_Time_series_analysis.ipynb . [ 92%]
[2024-10-15T15:30:39.499Z] ...... [ 94%]
[2024-10-15T15:30:41.790Z] notebooks/hummingbird.ipynb ............ [ 98%]
[2024-10-15T15:33:15.937Z] notebooks/stress-tests.ipynb ...... [100%]
[2024-10-15T15:33:15.937Z]
[2024-10-15T15:33:15.937Z] =============================== warnings summary ===============================
I think the configuration is overcomplicated
I think I agree that this has gotten out of hand.
The main issue is that I don't want to make it possible to break the service catalog which would break other things internally for the rest of the components in the stack. But I still don't fully understand how that is used...
For example, if I wanted to configure THREDDS with only fileServer for .nc and .txt
I think that your setup here is overly complicated actually (which highlights your point). I don't think you'd need to set THREDDS_MAGPIE_EXTRA_DATA_PREFIXES, and I don't think we ever want people to modify THREDDS_DEFAULT_FILE_FILTERS (it's not provided in env.local.example as a settable option).
Is there really any advantage of having duplicate sets...
I think I agree with this. But the defaults vs. the extras were requested in this discussion: https://github.com/bird-house/birdhouse-deploy/pull/472#discussion_r1790748137
Do you no longer think that's a concern?
I think it might be worthwhile to remove the defaults vs. extras duplication to make things easier for users in general. If one wants to preserve the defaults, it is easy to copy-paste their values and add the desired "extra" within a single variable.
Build URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2847/
Result :x: FAILURE
BIRDHOUSE_DEPLOY_BRANCH : thredds-more-configuration
DACCS_IAC_BRANCH : master
DACCS_CONFIGS_BRANCH : master
PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master
PAVICS_SDI_BRANCH : master
DESTROY_INFRA_ON_EXIT : true
PAVICS_HOST : https://host-140-216.rdext.crim.ca
:warning: Infrastructure deployment failed. :warning:
Instance destroyed due to CI execution.
To debug, launch an instance manually with PR reference thredds-more-configuration.
Build URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2848/
Result :white_check_mark: SUCCESS
BIRDHOUSE_DEPLOY_BRANCH : thredds-more-configuration
DACCS_IAC_BRANCH : master
DACCS_CONFIGS_BRANCH : master
PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master
PAVICS_SDI_BRANCH : master
DESTROY_INFRA_ON_EXIT : true
PAVICS_HOST : https://host-140-216.rdext.crim.ca
Tests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/1719/
[2024-10-16T18:01:20.587Z] ============================= test session starts ==============================
[2024-10-16T18:01:20.587Z] platform linux -- Python 3.11.6, pytest-8.2.0, pluggy-1.5.0
[2024-10-16T18:01:20.587Z] rootdir: /home/jenkins/agent/workspace/PAVICS-e2e-workflow-tests_master
[2024-10-16T18:01:20.587Z] plugins: anyio-4.3.0, dash-2.17.0, nbval-0.11.0, tornasync-0.6.0.post2, xdist-3.5.0
[2024-10-16T18:01:20.587Z] collected 301 items
[2024-10-16T18:01:20.587Z]
[2024-10-16T18:01:29.949Z] notebooks-auth/geoserver.ipynb .................. [ 5%]
[2024-10-16T18:02:34.894Z] notebooks-auth/test_cowbird_jupyter.ipynb .......... [ 9%]
[2024-10-16T18:02:40.584Z] notebooks-auth/test_thredds.ipynb ........... [ 12%]
[2024-10-16T18:03:27.629Z] pavics-sdi-master/docs/source/notebooks/CaSR_basic.ipynb ...... [ 14%]
[2024-10-16T18:03:36.576Z] pavics-sdi-master/docs/source/notebooks/WCS_example.ipynb ....... [ 17%]
[2024-10-16T18:03:46.309Z] pavics-sdi-master/docs/source/notebooks/WFS_example.ipynb ...... [ 19%]
[2024-10-16T18:11:22.227Z] pavics-sdi-master/docs/source/notebooks/climex.ipynb ............ [ 23%]
[2024-10-16T18:11:22.227Z] pavics-sdi-master/docs/source/notebooks/eccc-geoapi-climate-stations.ipynb . [ 23%]
[2024-10-16T18:11:27.795Z] ............... [ 28%]
[2024-10-16T18:11:35.513Z] pavics-sdi-master/docs/source/notebooks/eccc-geoapi-xclim.ipynb ..... [ 30%]
[2024-10-16T18:11:42.251Z] pavics-sdi-master/docs/source/notebooks/esgf-dap.ipynb ....... [ 32%]
[2024-10-16T18:11:57.194Z] pavics-sdi-master/docs/source/notebooks/forecasts.ipynb ...... [ 34%]
[2024-10-16T18:12:02.965Z] pavics-sdi-master/docs/source/notebooks/opendap.ipynb ....... [ 36%]
[2024-10-16T18:12:07.409Z] pavics-sdi-master/docs/source/notebooks/pavics_thredds.ipynb ..... [ 38%]
[2024-10-16T18:15:29.558Z] pavics-sdi-master/docs/source/notebooks/regridding.ipynb ............... [ 43%]
[2024-10-16T18:16:27.634Z] ............. [ 47%]
[2024-10-16T18:16:32.092Z] pavics-sdi-master/docs/source/notebooks/rendering.ipynb .... [ 49%]
[2024-10-16T18:16:34.018Z] pavics-sdi-master/docs/source/notebooks/subset-user-input.ipynb ........ [ 51%]
[2024-10-16T18:16:50.274Z] ................. [ 57%]
[2024-10-16T18:16:58.051Z] pavics-sdi-master/docs/source/notebooks/subsetting.ipynb ...... [ 59%]
[2024-10-16T18:16:58.999Z] pavics-sdi-master/docs/source/notebook-components/weaver_example.ipynb . [ 59%]
[2024-10-16T18:17:16.806Z] ......... [ 62%]
[2024-10-16T18:17:26.310Z] finch-master/docs/source/notebooks/dap_subset.ipynb ........... [ 66%]
[2024-10-16T18:17:35.354Z] finch-master/docs/source/notebooks/finch-usage.ipynb ...... [ 68%]
[2024-10-16T18:17:36.744Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb . [ 68%]
[2024-10-16T18:17:39.764Z] ..... [ 70%]
[2024-10-16T18:17:54.688Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-2Subsetting.ipynb . [ 70%]
[2024-10-16T18:18:12.766Z] ............ [ 74%]
[2024-10-16T18:18:27.686Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-3Climate-Indicators.ipynb . [ 75%]
[2024-10-16T18:18:50.859Z] .....s. [ 77%]
[2024-10-16T18:18:59.014Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-4Ensembles.ipynb . [ 77%]
[2024-10-16T18:19:15.301Z] .. [ 78%]
[2024-10-16T18:19:21.923Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-5Visualization.ipynb . [ 78%]
[2024-10-16T18:20:23.213Z] ......... [ 81%]
[2024-10-16T18:20:33.219Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-6Regridding_Conversion.ipynb . [ 82%]
[2024-10-16T18:25:23.925Z] .... [ 83%]
[2024-10-16T18:25:23.925Z] PAVICS-landing-master/content/notebooks/hydrology/PAVICStutorial_Hydrology-01_Intro.ipynb . [ 83%]
[2024-10-16T18:25:23.925Z] .... [ 85%]
[2024-10-16T18:25:23.925Z] PAVICS-landing-master/content/notebooks/hydrology/PAVICStutorial_Hydrology-02_Calibration.ipynb . [ 85%]
[2024-10-16T18:25:29.145Z] ..... [ 87%]
[2024-10-16T18:25:34.453Z] PAVICS-landing-master/content/notebooks/hydrology/PAVICStutorial_Hydrology-03_Watershed_properties.ipynb . [ 87%]
[2024-10-16T18:26:02.631Z] ............. [ 91%]
[2024-10-16T18:26:05.951Z] PAVICS-landing-master/content/notebooks/hydrology/PAVICStutorial_Hydrology-04_Time_series_analysis.ipynb . [ 92%]
[2024-10-16T18:26:07.684Z] ...... [ 94%]
[2024-10-16T18:26:10.184Z] notebooks/hummingbird.ipynb ............ [ 98%]
[2024-10-16T18:28:49.817Z] notebooks/stress-tests.ipynb ...... [100%]
[2024-10-16T18:28:49.817Z]
[2024-10-16T18:28:49.817Z] =============================== warnings summary ===============================
Overview
Currently the default THREDDS configuration creates two default datasets, the Service Data dataset and the Main dataset. The Service Data dataset is used internally and hosts WPS outputs. The Main dataset is the place where users can access data served by THREDDS. Both of these are configured to serve files with the following extensions: .nc .ncml .txt .md .rst .csv
In order to allow the THREDDS server to serve files with additional extensions, this introduces two new variables:
THREDDS_SERVICE_DATA_EXTRA_FILE_FILTERS: this allows users to specify additional filter elements to the Service Data dataset. This is especially useful if a WPS outputs files with an extension other than the default (e.g., .h5) to the wps_outputs/ directory.
THREDDS_DATASET_DATASETSCAN_BODY: this allows users to specify the whole body of the main dataset's <datasetScan> element. This allows users to fully customize how this dataset serves files.
We limit the configuration options for the Service Data dataset more than the main dataset because the Service Data dataset requires a basic configuration in order to properly serve WPS outputs. Making significant changes to this configuration could have unexpected negative impacts on WPS usage.
The defaults for these new variables are fully backwards compatible. Without changing these variables, the THREDDS server should behave exactly the same as before.
Changes
Non-breaking changes
Breaking changes
Related Issue / Discussion
Additional Information
CI Operations
birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false