cisagov / cyhy-core

Core code for Cyber Hygiene (CyHy)

Change the base URL used to retrieve GNIS data in the `load_places.sh` script #89

Closed mcdonnnj closed 6 months ago

mcdonnnj commented 6 months ago

🗣 Description

This pull request changes the base URL that is used to retrieve GNIS data for import into the database.

💭 Motivation and context

The current URL points to a server that sometimes responds that the service is unavailable. The same data is now stored in the Amazon S3 bucket that the data provider uses to distribute its various datasets. It makes sense to switch to this bucket, as it should be more reliably available than the current server.

🧪 Testing

I verified that I was able to pull in GNIS data in my testing environment when deploying a new database instance. I also downloaded the two versions and verified they matched both in size and SHA256 hash.
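For anyone who wants to repeat the size and hash comparison, here is a minimal sketch of one way to do it. It is not necessarily the exact procedure used for this pull request. The helper name `same_file` is hypothetical, and the sketch assumes the two downloads are already saved locally and that GNU `sha256sum` is available (on macOS, `shasum -a 256` would be the rough equivalent).

```shell
# Sketch: succeed only if two local files match in both byte size and SHA256 digest.
# same_file is a hypothetical helper; it is not part of load_places.sh.
same_file() {
  file_a="$1"
  file_b="$2"

  # Compare sizes first; this is cheap and catches truncated downloads.
  size_a=$(wc -c < "$file_a" | tr -d ' ')
  size_b=$(wc -c < "$file_b" | tr -d ' ')
  [ "$size_a" = "$size_b" ] || return 1

  # Compare SHA256 digests (assumes GNU coreutils sha256sum is installed).
  hash_a=$(sha256sum "$file_a" | cut -d ' ' -f 1)
  hash_b=$(sha256sum "$file_b" | cut -d ' ' -f 1)
  [ "$hash_a" = "$hash_b" ]
}
```

Example use, with placeholder file names: `same_file old_server_copy.zip s3_copy.zip && echo "files match"`.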


dv4harr10 commented 6 months ago

Hi Team, I am getting the following error messages for the 2 updated URLs: https://www.usgs.gov/us-board-on-geographic-names/download-gnis-data returns the message "Site under maintenance", and https://prd-tnm.s3.amazonaws.com/StagedProducts/GeographicNames/Archive/TopicalGazetteers/ returns the message "This XML file does not appear to have any style information associated with it. The document tree is shown below."

dv4harr10 commented 6 months ago

Hi Nick @mcdonnnj, are you noticing the same issue with these URLs, or getting something different?

michaelsaki commented 6 months ago

> Hi Team, I am getting the following error messages for the 2 updated URLs: https://www.usgs.gov/us-board-on-geographic-names/download-gnis-data returns the message "Site under maintenance", and https://prd-tnm.s3.amazonaws.com/StagedProducts/GeographicNames/Archive/TopicalGazetteers/ returns the message "This XML file does not appear to have any style information associated with it. The document tree is shown below."

@dv4harr10

This URL works for me: https://www.usgs.gov/us-board-on-geographic-names/download-gnis-data so not really sure what is going on there.

I believe the second URL is just for pulling information to store in the database; note the name of the variable associated with it, `DATA_BASE_URL`. So it is likely just used as an interface to interact with a database.

dv4harr10 commented 6 months ago

Hi Team, the first link is working for me now, so it appears to have been a temporary issue. Since the second link is for the database, per Michael, it would be a non-issue as well. Thanks.

mcdonnnj commented 6 months ago

> Hi Team, I am getting the following error messages for the 2 updated URLs: https://www.usgs.gov/us-board-on-geographic-names/download-gnis-data returns the message "Site under maintenance", and https://prd-tnm.s3.amazonaws.com/StagedProducts/GeographicNames/Archive/TopicalGazetteers/ returns the message "This XML file does not appear to have any style information associated with it. The document tree is shown below."

The second link is the base URL. We only use it to form the full URL we attempt to download in https://github.com/cisagov/cyhy-core/blob/d4df9a57ae05444d6d3e2a86514d60d12b42d6f8/var/load_places.sh#L50-L56
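To illustrate the idea, here is a rough sketch of how a base URL can be combined with a file name to form the full download URL. This is not a copy of the linked lines in `load_places.sh`. `DATA_BASE_URL` is the variable name mentioned in this thread and its value is the URL quoted above; the helper `build_url` and the file name `EXAMPLE_FILE.zip` are hypothetical placeholders.

```shell
# Base URL quoted in this thread. On its own it is not a downloadable file.
DATA_BASE_URL="https://prd-tnm.s3.amazonaws.com/StagedProducts/GeographicNames/Archive/TopicalGazetteers/"

# build_url is a hypothetical helper: it joins the base URL and a file name,
# dropping one trailing slash from the base so the result has no double slash.
build_url() {
  printf '%s/%s\n' "${DATA_BASE_URL%/}" "$1"
}
```

A download would then look something like `curl --fail --silent --show-error --location --output EXAMPLE_FILE.zip "$(build_url EXAMPLE_FILE.zip)"`, with `EXAMPLE_FILE.zip` replaced by a real file name from the bucket.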

michaelsaki commented 6 months ago

> > Hi Team, I am getting the following error messages for the 2 updated URLs: https://www.usgs.gov/us-board-on-geographic-names/download-gnis-data returns the message "Site under maintenance", and https://prd-tnm.s3.amazonaws.com/StagedProducts/GeographicNames/Archive/TopicalGazetteers/ returns the message "This XML file does not appear to have any style information associated with it. The document tree is shown below."
>
> The second link is the base URL. We only use it to form the full URL we attempt to download in https://github.com/cisagov/cyhy-core/blob/d4df9a57ae05444d6d3e2a86514d60d12b42d6f8/var/load_places.sh#L50-L56

Oh that makes sense, should have guessed that with the underscore between "data" and "base". Thanks @mcdonnnj