traitecoevo / austraits.build

Source for AusTraits

Add `custom_R_code` workaround for problem with duplicate context values #749

Closed yangsophieee closed 9 months ago

yangsophieee commented 1 year ago

Currently, if a dataset has multiple context values that share the same `value:` field (but perhaps have different `find:` fields), AusTraits does not collapse those duplicates when building the `austraits$contexts` table. Because the resulting rows are not unique, pivoting produces a wide table with list columns where there should be character columns.

I've added `custom_R_code` to remove the duplicate context values from the context metadata. In the future we will try to fix this in the AusTraits build process itself.
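The duplicate-value problem and the de-duplication workaround can be illustrated with a minimal sketch. This is plain Python, not the AusTraits R code: the row layout, the `id` key, and the helper names `pivot_wider` and `drop_duplicate_values` are assumptions made up for illustration only.

```python
from collections import defaultdict

# Hypothetical context rows: two different `find` entries map to the same `value`.
rows = [
    {"id": "01", "context_property": "season", "find": "summer", "value": "summer"},
    {"id": "01", "context_property": "season", "find": "Summer", "value": "summer"},
    {"id": "02", "context_property": "season", "find": "winter", "value": "winter"},
]


def pivot_wider(rows):
    """Pivot long rows into {id: {context_property: cell}}.

    A cell is a plain string when exactly one row maps to it,
    and a list when several rows map to it (the unwanted case).
    """
    cells = defaultdict(list)
    for row in rows:
        cells[(row["id"], row["context_property"])].append(row["value"])
    wide = {}
    for (row_id, prop), values in cells.items():
        wide.setdefault(row_id, {})[prop] = values[0] if len(values) == 1 else values
    return wide


def drop_duplicate_values(rows):
    """Keep the first row for each (id, context_property, value), ignoring `find`."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["id"], row["context_property"], row["value"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


with_duplicates = pivot_wider(rows)
deduplicated = pivot_wider(drop_duplicate_values(rows))
```

With the duplicates left in, the cell for id `01` is a list. After de-duplication every cell is a single string, which corresponds to the character columns expected in the wide table.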

codecov-commenter commented 1 year ago

Codecov Report

Patch and project coverage have no change.

Comparison is base (f977ea9) 80.24% compared to head (a3d8e59) 80.24%.

:exclamation: Current head a3d8e59 differs from pull request most recent head 7d749ca. Consider uploading reports for the commit 7d749ca to get more accurate results

Additional details and impacted files

```diff
@@           Coverage Diff            @@
##           develop     #749   +/-   ##
========================================
  Coverage    80.24%   80.24%
========================================
  Files            7        7
  Lines         1534     1534
========================================
  Hits          1231     1231
  Misses         303      303
```


ehwenk commented 1 year ago

AusTraits now fully pivots wide again, as long as `plot_id` is included in the list of pivot variables. Previously `plot_id` was not included, as it is used to create `population_id`s. However, Yang_2023 is an example of a study without "locations" (species-wide values) but with some plot contexts. In this study, `population_id`s remain NA. It seems we might need to tweak the process so that `population_id`s are created from `plot_id` (or `treatment_id`) even if no locations are specified. @yangsophieee can you add this as an issue to traits.build?

The fixes on this branch might not fully solve the second problem of some numeric context values not being read in. These will be instances where a context provides background information for a trait value but is not required to uniquely identify rows of data. A dataset test for this would be: for each context category, count the distinct combinations of context values within a dataset and confirm that the count equals the number of ids generated for that category (i.e. for each type of context identifier in `austraits$traits`). @yangsophieee can you add this as a test we should add to traits.build?
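A rough sketch of the proposed check follows, written in plain Python rather than as a traits.build test. The observation layout, the `categories` mapping, and the function name `count_mismatches` are all hypothetical; a real test would read these from the built dataset.

```python
def count_mismatches(observations, categories):
    """Compare distinct context-value combinations with generated ids.

    `categories` maps a category name to (id_column, [value_columns]).
    Returns {category: (n_combinations, n_ids)} for categories where
    the two counts differ.
    """
    mismatches = {}
    for category, (id_column, value_columns) in categories.items():
        combos = set()
        ids = set()
        for obs in observations:
            combo = tuple(obs.get(col) for col in value_columns)
            # Ignore observations with no context values in this category.
            if any(value is not None for value in combo):
                combos.add(combo)
            if obs.get(id_column) is not None:
                ids.add(obs[id_column])
        if len(combos) != len(ids):
            mismatches[category] = (len(combos), len(ids))
    return mismatches


# Hypothetical data: two fire histories share one plot_id, so "plot" is flagged.
observations = [
    {"temporal_id": "01", "sampling season": "summer", "plot_id": "01", "fire history": "burnt"},
    {"temporal_id": "02", "sampling season": "winter", "plot_id": "01", "fire history": "burnt"},
    {"temporal_id": "02", "sampling season": "winter", "plot_id": "01", "fire history": "unburnt"},
]
categories = {
    "temporal": ("temporal_id", ["sampling season"]),
    "plot": ("plot_id", ["fire history"]),
}

result = count_mismatches(observations, categories)
```

In this made-up data the temporal category has two value combinations and two ids, so it passes. The plot category has two value combinations but only one id, so it is reported as a mismatch.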

@yangsophieee @dfalster This branch does need to be merged in before any releases, even minor ones, because otherwise the austraits functions won't work.