Closed kheal closed 1 month ago
Just as a reminder, I find it easiest to review by looking at the rendered notebook in the branch (https://nbviewer.org/github/microbiomedata/notebook_hackathons/blob/neon_metadata_keyerror/NEON_soil_metadata/python/neon_soil_metadata_visual_exploration.ipynb) and its associated Google Colab notebook (https://colab.research.google.com/github/microbiomedata/notebook_hackathons/blob/neon_metadata_keyerror/NEON_soil_metadata/python/neon_soil_metadata_visual_exploration.ipynb).
Once the PR is merged into main, the notebooks will be accessible via the links in this readme: https://github.com/microbiomedata/nmdc_notebooks/tree/main/NEON_soil_metadata#readme.
@kheal are the changes in cell 4?:
# Check if samp has keys that correspond to primary metadata
if set(['lat_lon', 'geo_loc_name', 'collection_date']).issubset(samp):
Yep! And then a few spots further down that used to reference all_results now reference all_full_results.
Closes #28.
The issue arose from bio samples that must have been ingested after we first created the notebooks. Some of the new bio samples do not have a geo_loc_name slot and were throwing a KeyError. I've added a check that keeps only the samples that have a collection_date, geo_loc_name, and lat_lon before adding their data to downstream analyses.
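For reviewers who want to see the check in isolation, here is a minimal sketch of the filtering described above. It is not the notebook's actual cell: the helper name keep_full_results and the sample records are made up for illustration, and it assumes each sample is a plain dict and that all_full_results holds the filtered samples.

```python
# Keys a sample must carry before it is used downstream (taken from the PR description).
REQUIRED_KEYS = {"lat_lon", "geo_loc_name", "collection_date"}


def keep_full_results(all_results):
    """Return only the samples that contain every required key.

    Hypothetical helper: the notebook does this check inline in a loop.
    """
    # set.issubset() accepts any iterable; iterating a dict yields its keys.
    return [samp for samp in all_results if REQUIRED_KEYS.issubset(samp)]


# Made-up records for illustration only; real NMDC bio samples have many more slots.
all_results = [
    {
        "id": "sample-1",
        "lat_lon": {"latitude": 40.0, "longitude": -105.0},
        "geo_loc_name": "USA: Colorado",
        "collection_date": "2020-06-01",
    },
    {
        # No geo_loc_name slot, so indexing samp["geo_loc_name"] would raise a KeyError.
        "id": "sample-2",
        "lat_lon": {"latitude": 35.0, "longitude": -84.0},
        "collection_date": "2021-07-15",
    },
]

all_full_results = keep_full_results(all_results)
print([samp["id"] for samp in all_full_results])
```

Checking key membership up front keeps the downstream cells free of try/except blocks, at the cost of silently dropping incomplete samples, so it may be worth logging how many were skipped.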