International-Soil-Radiocarbon-Database / ISRaD

Repository for the development and release of ISRaD data and tools
https://international-soil-radiocarbon-database.github.io/ISRaD/

Soil Order Controlled Vocabulary #75

Closed coreylawrence closed 5 years ago

coreylawrence commented 6 years ago

It is proposed that we transition to using the 12 USDA soil orders as a controlled vocabulary for the pro_soil_taxon field. This would require updating the existing data entries to be consistent with this convention. There is code in development that would help with this transition.

greymonroe commented 6 years ago

does the code change the template files directly or just the ISRaD_data object?

coreylawrence commented 6 years ago

Yes, it would convert the field in question to a controlled vocabulary. We need to decide if we are willing to commit to such a change.

jb388 commented 6 years ago

I think we've straightened this out: pro_soil_taxon remains as a "free text" field for recording more specific USDA soil taxonomy or WRB taxonomy. A new field, "pro_usda_soil_order" now exists in the template with controlled vocabulary. The ISRaD.extra.fill_soilorders function fills the new column from text entered in the pro_soil_taxon field.

However, many templates are still missing taxonomic data and will need to have these data filled by revisiting the original manuscripts or through the use of geospatial products. Consequently, the pro_usda_soil_order field is NOT required, as the templates missing these data would cause the compile function to fail.
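To illustrate what "controlled vocabulary but not required" means in practice, here is a rough Python sketch. It is not the actual ISRaD QAQC code (the package is written in R); the function name `check_soil_order` and the plural spellings of the orders are assumptions made for this example, and the template may spell the allowed values differently.

```python
# Hypothetical sketch: an optional field with a controlled vocabulary.
# The 12 USDA soil orders; the exact spelling used in the ISRaD template
# is an assumption here.
USDA_SOIL_ORDERS = frozenset({
    "Alfisols", "Andisols", "Aridisols", "Entisols", "Gelisols", "Histosols",
    "Inceptisols", "Mollisols", "Oxisols", "Spodosols", "Ultisols", "Vertisols",
})


def check_soil_order(value):
    """Return an error message, or None if the value is acceptable."""
    # The field is not required, so a blank entry must pass.
    if value is None or str(value).strip() == "":
        return None
    # Anything that is filled in must come from the controlled vocabulary.
    if str(value).strip() not in USDA_SOIL_ORDERS:
        return f"'{value}' is not one of the 12 USDA soil orders"
    return None
```

With a check like this, templates that leave pro_usda_soil_order blank still compile, while an entry such as "Cambisol" is flagged for review.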

Going forward, the ISRaD.extra.fill_soilorders function needs to be modified or a new function needs to be created to complete the task of filling the pro_usda_soil_order field in the derived ISRaD.extra product.

coreylawrence commented 6 years ago

I am ok with this solution for now. Though I think it best if we go back through the existing templates and add this information by hand, rather than fill via script.

We should probably merge the dev and master soonish so that the current template version served via the web interface includes the pro_usda_soil_order column.

jb388 commented 6 years ago

I'll get our students started on this tomorrow--i.e. looking up the data in the original manuscripts. Caitlin also offered to help. Note that the script (ISRaD.extra.fill_soilorders) only applies to the fraction studies where some form of soil taxonomy was entered in the "pro_soil_taxon" column. This leaves a lot of entries/profiles with missing data.

jb388 commented 6 years ago

@coreylawrence I merged dev with master, so template on website should be up-to-date.

jb388 commented 6 years ago

Update: I tasked one of our student workers with looking up soil taxonomy for the profiles that do not have any information entered in the existing pro_soil_taxon field. I gave him basic training in parsing USDA soil taxonomy strings (i.e. looking for the soil order "keys" such as 'ept', 'alf', 'ox', etc.), and told him to fill in the pro_usda_soil_order field if possible. But he is not an expert in soil, so I told him that if the USDA soil order was not obvious, he should just enter whatever information he could find into the pro_soil_taxon field. This can then be reviewed and converted to USDA soil orders later.
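For reference, the "keys" approach can be sketched roughly as below. This is a Python illustration only, not the ISRaD.extra code (which is in R); `guess_usda_soil_order` and `fill_soil_orders` are hypothetical names, and profiles are assumed to be plain dicts. The mapping is the standard set of USDA formative elements that end suborder, great group, and subgroup names. The matching is deliberately naive: an ordinary word that happens to end in a key (e.g. "sand") can produce a conflicting match, in which case nothing is filled and the entry is left for manual review.

```python
import re

# Formative element (name ending) -> USDA soil order.
FORMATIVE_ELEMENTS = {
    "alf": "Alfisols", "and": "Andisols", "id": "Aridisols",
    "ent": "Entisols", "el": "Gelisols", "ist": "Histosols",
    "ept": "Inceptisols", "oll": "Mollisols", "ox": "Oxisols",
    "od": "Spodosols", "ult": "Ultisols", "ert": "Vertisols",
}


def guess_usda_soil_order(taxon):
    """Return the USDA soil order implied by a taxonomy string, or None."""
    if not taxon:
        return None
    matches = set()
    for word in re.findall(r"[a-z]+", taxon.lower()):
        # Drop a plural "s" so "Hapludalfs" and "Hapludalf" behave the same.
        stem = word[:-1] if word.endswith("s") else word
        for element, order in FORMATIVE_ELEMENTS.items():
            # Case 1: the order itself is written out, e.g. "Oxisol".
            if stem == order[:-1].lower():
                matches.add(order)
            # Case 2: the word ends with a formative element, e.g. "...alf".
            elif len(stem) > len(element) and stem.endswith(element):
                matches.add(order)
    # Only accept an unambiguous result; otherwise leave it for review.
    return matches.pop() if len(matches) == 1 else None


def fill_soil_orders(profiles):
    """Fill blank pro_usda_soil_order values from pro_soil_taxon, in place."""
    for profile in profiles:
        if not profile.get("pro_usda_soil_order"):
            guess = guess_usda_soil_order(profile.get("pro_soil_taxon"))
            profile["pro_usda_soil_order"] = guess
    return profiles
```

For example, `guess_usda_soil_order("Typic Hapludalfs")` gives "Alfisols", while a WRB name such as "Haplic Cambisol" gives None and stays as free text in pro_soil_taxon.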

Once the templates have been updated, he will run QAQC with the online tool, and submit the updated template to the israd.info@gmail.com address.

greymonroe commented 5 years ago

status update? can we close?