ncss-tech / SoilTaxonomy

A System of Soil Classification for Making and Interpreting Soil Surveys
https://ncss-tech.github.io/SoilTaxonomy/
GNU General Public License v3.0

Refine procedures for building internal datasets #28

brownag closed this issue 1 year ago

brownag commented 3 years ago

As I have tacked more things onto the "rebuild R package datasets" script, it has become more complicated.

There are currently several steps involved in rebuilding the full set of datasets. For instance, extracting/preparing dictionaries for formative elements from NASIS domains is a separate script and uses CSVs as an intermediate format. A couple of my datasets are pulled/derived from SoilKnowledgeBase. Some of the logic currently stored in the dataset-building script might be better offloaded to the KST parser. Curious if parsing could be improved using information pulled from NASIS domains...

Also, I think I want to standardize on having a raw (flat file) data set for every internal dataset, and use something similar to the usethis::use_data_raw setup.

brownag commented 3 years ago

https://github.com/ncss-tech/SoilTaxonomy/pull/23 shows some of the effects (and gory details) of the current implementation of parsing class names from NASIS domains and definitions from SKB.

brownag commented 1 year ago

The data-raw scripts have been set up. There is no immediate issue here. Some of the parsing will need to be updated when the 13th edition of KST is eventually released, but that can be dealt with once those changes are implemented and published.