internetofwater / nldi-services

Network Linked Data Index Navigation Web Services
https://waterdata.usgs.gov/blog/nldi-intro/
Creative Commons Zero v1.0 Universal
19 stars 15 forks source link

Docker demo not working as expected. #387

Closed dblodgett-usgs closed 1 year ago

dblodgett-usgs commented 1 year ago

I feel like I must be missing something.

If I do:

docker-compose up -d nldi-db
docker-compose up nldi-services

I know the feature I'm requesting is in the demo database -- I add the 5432 port to docker compose and connect with pgadmin and see it in the table.

I found a query in logs and tried it on my local database.

select count(*) != 0 from nldi_data.feature join nldi_data.crawler_source on feature.crawler_source_id = crawler_source.crawler_source_id where identifier = '070900020501' and lower(source_suffix) = lower('huc12pp')

Returns FALSE locally but TRUE on the dev database.

I get a row when I try:

select * from nldi_data.feature_huc12pp where identifier = '070900020501'

But something isn't right with the query that goes through the nldi_data.feature table.

I did verify that things are hooked up.

http://localhost:8080/nldi/linked-data/comid/13296606?f=json does return data.

EthanGrahn commented 1 year ago

Are you requesting /nldi/linked-data/huc12pp/070900020501? It's possible that the demo data is stale as I've seen that before. I'd recommend checking that the crawler source table has the correct information for huc12pp, if at all.

dblodgett-usgs commented 1 year ago

I think we had some streams crossed with the database configuration. @webb-ben re you able to pick up where I left off on the database configuration documentation? I meant to make more time for it but haven't been able to.

webb-ben commented 1 year ago

@dblodgett-usgs Happy to cover more database documentation. Would you help me by clarifying the image tags you are using for local and dev?

I can confirm that as of two weeks ago, ghcr.io/internetofwater/nldi:demo did not work out of the box. That was because the nldi_data.crawler_source table had a different crawler_source_id than the corresponding data in the feature table. The artifacts used for internetofwater/nldi-db:demo, insert features directly into their corresponding feature table without using the nldi_data.crawler_source table to synchronize the crawler_source_id.

SELECT crawler_source_id, source_suffix FROM nldi_data.crawler_source ORDER BY crawler_source_id previously returned: crawler_source_id source_suffix
1 WQP
2 huc12pp
5 nwissite
6 wade
7 ref_gage
8 ca_gages
9 gfv11_pois
10 vigil
11 nwisgw
12 nmwdi-st
13 geoconnex-demo
With https://github.com/internetofwater/nldi-db/pull/106, SELECT crawler_source_id, source_suffix FROM nldi_data.crawler_source ORDER BY crawler_source_id now returns: crawler_source_id source_suffix
1 WQP
3 nwissite
4 huc12pp

You can also confirm this by comparing the differences between SELECT crawler_source_id, COUNT(*) AS feature_count FROM nldi_data.feature GROUP BY crawler_source_id; and SELECT crawler_source.crawler_source_id, COUNT(*) AS feature_count FROM nldi_data.feature JOIN nldi_data.crawler_source ON feature.crawler_source_id = crawler_source.crawler_source_id GROUP BY crawler_source.crawler_source_id;

dblodgett-usgs commented 1 year ago

I did not make any changes to the demo-scale artifacts. I think they should be release artifacts going forward for accessibility. The prod scale artifacts are available from the requester pays S3 bucket as mentioned in the readme of the nldi-db repo in this PR https://github.com/internetofwater/nldi-db/pull/105/files

dblodgett-usgs commented 1 year ago

Based on conversation at tag up today, I think we can close this. Fixed by: https://github.com/internetofwater/nldi-db/pull/106