update load_crns with different file types

lgsmith295 / pcrecon

Principal Component Regression analysis for tree ring data

MIT License


update load_crns with different file types #23

Open lgsmith295 opened 4 years ago

lgsmith295 commented 4 years ago

Not needed now, but later update the function to parse and select files based on type codes, i.e. the single letters at the end of .crn filenames:

Tree Ring Measurement Type Codes

| Code | Measurement Type |
|------|------------------|
| D | Total Ring Density |
| E | Earlywood Width |
| I | Earlywood Density |
| L | Latewood Width |
| N | Minimum Density |
| R | Ring Width |
| T | Latewood Density |
| X | Maximum Density |
| P | Latewood Percent |

Tree Ring Chronology Type Codes

| Code | Chronology Type |
|------|-----------------|
| A | ARSTND |
| P | Low Pass Filter |
| R | Residual |
| S | Standard |
| W | Re-Whitened Residual |
| N | Measurements Only |
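For reference, the codes listed above could be kept as two named lookup vectors in R. This is only a sketch: the names `measurement_codes`, `chronology_codes`, and `ambiguous_codes` are made up for illustration and are not part of the package.

```r
# Measurement type codes, transcribed from the list above.
measurement_codes <- c(
  D = "Total Ring Density",
  E = "Earlywood Width",
  I = "Earlywood Density",
  L = "Latewood Width",
  N = "Minimum Density",
  R = "Ring Width",
  T = "Latewood Density",
  X = "Maximum Density",
  P = "Latewood Percent"
)

# Chronology type codes, transcribed from the list above.
chronology_codes <- c(
  A = "ARSTND",
  P = "Low Pass Filter",
  R = "Residual",
  S = "Standard",
  W = "Re-Whitened Residual",
  N = "Measurements Only"
)

# Letters that appear in both sets cannot be resolved from the letter alone.
ambiguous_codes <- intersect(names(measurement_codes), names(chronology_codes))
```

Any selection logic based on the trailing letter would need extra information to handle the letters in `ambiguous_codes`.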

Originally posted by @lgsmith295 in https://github.com/lgsmith295/pcreg/issues/1#issuecomment-612460772

djhocking commented 4 years ago

This is the worst. So NM01r.crn could be a residual chronology or raw ring widths (averaged across trees to form a chronology?). P can also be one of two things. Plus, the filenames have no separators, which makes parsing lots of fun: the front part has no set number of characters (2 letters in the US but 4 or 5 in Canada), followed by 0-3 numbers and 0-1? letters, and any of those can end with "-noaa". Wow, just wow.

The ultimate solution will be to parse the metadata files to find the characteristics you want and then extract the full filename from that. Although, I don't have much hope that the metadata are consistent enough to do this cleanly.
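Until the metadata route is explored, a filename-only split is possible for the subset of names that contain digits, since the digits act as a separator. A rough sketch in base R follows. The function name `split_crn_name` and the returned field names are hypothetical, and the pattern assumes the layout described above (letters, 1-3 digits, at most one trailing letter, optional "-noaa"). Names without digits are returned as NULL because they cannot be split reliably.

```r
# Hypothetical helper: split a .crn filename into its parts.
# Returns NULL when the name has no digits or does not fit the assumed layout.
split_crn_name <- function(filename) {
  # Drop the file ending, then note and drop an optional "-noaa" suffix.
  stem <- sub("\\.crn$", "", filename, ignore.case = TRUE)
  noaa <- grepl("-noaa$", stem)
  stem <- sub("-noaa$", "", stem)

  # Letters, then 1-3 digits, then at most one trailing type-code letter.
  pattern <- "^([A-Za-z]+)([0-9]{1,3})([A-Za-z]?)$"
  m <- regmatches(stem, regexec(pattern, stem))[[1]]
  if (length(m) == 0) return(NULL)

  list(site = m[2], number = m[3], code = m[4], noaa = noaa)
}

split_crn_name("NM01r.crn")         # site "NM", number "01", code "r"
split_crn_name("cana123-noaa.crn")  # site "cana", number "123", no code, noaa TRUE
split_crn_name("NMs.crn")           # NULL: no digits, so "s" may or may not be a code
```

Even when the split succeeds, the meaning of the trailing letter is still ambiguous for codes shared between the measurement and chronology lists.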

djhocking commented 4 years ago

Also the Standard chronology usually doesn't have any letters appended to the end. So a standard chronology could look like NM.crn, NM01.crn, NM123.crn, NMs.crn, NM01s.crn, or NM123s.crn.
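One way to cope with the two possible spellings of a standard chronology is to try both and take whichever is actually present. A minimal sketch, assuming exact (case-sensitive) filename matches; `find_standard_crn` and the example file list are hypothetical:

```r
# Hypothetical helper: return the first candidate filename for a standard
# chronology that is present in `available`, or NA if neither is found.
find_standard_crn <- function(name, available) {
  # A standard chronology may have no trailing letter or a trailing "s".
  candidates <- paste0(name, c(".crn", "s.crn"))
  hit <- candidates[candidates %in% available]
  if (length(hit) == 0) NA_character_ else hit[1]
}

# Made-up directory listing for illustration.
available <- c("NM01s.crn", "NM123.crn", "NM123r.crn")

find_standard_crn("NM01", available)   # "NM01s.crn"
find_standard_crn("NM123", available)  # "NM123.crn"
find_standard_crn("NM999", available)  # NA
```

In practice `available` could come from `list.files()` on the data directory.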

djhocking commented 4 years ago

For now I will use

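      # Append the file ending that matches the requested chronology type
      crns[i] <- switch(type,
                        standard = paste0(crns[i], ".crn"),
                        residual = paste0(crns[i], "r", ".crn"),
                        stop("unknown chronology type: ", type)
                        )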

where crns is a vector of chronology names without a file ending or residual codes. It assumes that most standard chronologies do not have the s ending.

The function will also skip this step if it is fed a list of full filenames ending in .crn so that someone can create a folder of just the files they want and then say crns = list.files(dir) or something like that.
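A vectorized version of the same idea, including the skip for names that already end in .crn, might look like the sketch below. The function name `add_crn_ending` is hypothetical, and it checks each element separately rather than skipping the whole step, which is a small variation on the behavior described above.

```r
# Hypothetical helper: turn chronology names into filenames, leaving
# anything that already ends in .crn untouched.
add_crn_ending <- function(crns, type = c("standard", "residual")) {
  type <- match.arg(type)  # errors on anything other than the two known types
  suffix <- switch(type, standard = ".crn", residual = "r.crn")

  # Only names without a .crn ending get the suffix appended.
  has_ending <- grepl("\\.crn$", crns, ignore.case = TRUE)
  crns[!has_ending] <- paste0(crns[!has_ending], suffix)
  crns
}

add_crn_ending(c("NM01", "NM123"), type = "residual")
add_crn_ending(c("NM01.crn", "NM02r.crn"))
```

As in the snippet above, this assumes standard chronologies carry no trailing s.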