HDFGroup / hsds

Cloud-native, service based access to HDF data
https://www.hdfgroup.org/solutions/hdf-kita/
Apache License 2.0

fix reference to dn_urls in k8s_update_dn_info #102

Closed jananzhu closed 2 years ago

jananzhu commented 2 years ago

In k8s_update_dn_info, a loop is supposed to send an /info request to each discovered DN IP to verify that the DN has initialized successfully before adding it to the DN list. The loop currently references the global copy of the list instead of the new list obtained via the k8s API. As a result, a DN that is not yet ready can be added to the dn_urls list and saved back into the global, causing a mismatch between dn_urls and dn_ids.
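To illustrate the intended behavior, here is a minimal sketch rather than the actual HSDS code. It assumes a hypothetical helper collect_ready_dns, an injected get_info coroutine standing in for the HTTP call, and an /info response that is a dict containing an "id" key. The point is that the loop iterates over the freshly discovered URLs only, and that the URL and ID lists are built together so they cannot drift apart.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, List, Tuple

# Hypothetical signature for the HTTP helper: takes a URL, returns the parsed JSON body.
GetInfo = Callable[[str], Awaitable[Dict[str, str]]]


async def collect_ready_dns(
    discovered_urls: Iterable[str], get_info: GetInfo
) -> Tuple[List[str], List[str]]:
    """Return (dn_urls, dn_ids) for the discovered DNs that answer /info.

    Only the newly discovered URLs are probed; no global list is consulted.
    """
    ready_urls: List[str] = []
    ready_ids: List[str] = []
    for url in discovered_urls:
        try:
            info = await get_info(url + "/info")
        except Exception:
            # DN not reachable or not initialized yet: skip it for this round.
            continue
        dn_id = info.get("id") if isinstance(info, dict) else None
        if dn_id is None:
            # Assumed response shape is missing the id: treat the DN as not ready.
            continue
        # Append to both lists in the same step so the indices stay aligned.
        ready_urls.append(url)
        ready_ids.append(dn_id)
    return ready_urls, ready_ids
```

The caller would then store both returned lists into the shared state in one step, so dn_urls and dn_ids always have the same length and order.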

We discovered this while testing scaling of the HSDS cluster with several hsload requests in flight. The s3syncCheck task crashed when its notify root request to a DN returned a 503 error that was not handled.
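Independently of the list fix, a background task can be made tolerant of a DN that is temporarily unavailable. The following is a hedged sketch only: ServiceUnavailableError, notify_all, and the injected notify coroutine are hypothetical names, not HSDS APIs. It shows a periodic task catching a 503-style failure per DN and recording it for a later retry instead of letting the exception end the task.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


class ServiceUnavailableError(Exception):
    """Hypothetical stand-in for an HTTP 503 raised by the HTTP helper."""


async def notify_all(
    dn_urls: Iterable[str], notify: Callable[[str], Awaitable[None]]
) -> List[str]:
    """Send a notification to each DN and return the URLs that answered 503."""
    failed: List[str] = []
    for url in dn_urls:
        try:
            await notify(url)
        except ServiceUnavailableError:
            # The DN is not ready; remember it so the next cycle can retry.
            failed.append(url)
    return failed
```

Other exception types are deliberately left uncaught here, so genuine bugs still surface.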