VincyaneBadouard / TreeData_broken

Harmonization and correction forest data tool.
https://vincyanebadouard.github.io/TreeData/

Statuses and codes conversions #44

Closed ValentineHerr closed 2 years ago

ValentineHerr commented 2 years ago

This is a first sketch of ideas for dealing with Statuses and Code, and I need feedback about this @gabrielareto, @cpiponiot and @VincyaneBadouard

Basic Alive vs Dead --> LifeStatus, logical TRUE/FALSE

We are already doing this by asking, "What column stores the information about whether a tree is dead or alive?", and then asking, "In that column, which code(s) represent a tree that is alive?" We should be more specific about this, and maybe say:

Live Codes --> LiveCodes

character codes indicating things about a tree that is alive (e.g. leaning, unhealthy, etc...)

Dead Codes --> DeadCodes

character codes indicating things about a tree that is dead or not available for census (e.g. mode of death, broken below, not found, etc...)


Then the idea would be to have the user enter the meaning of each of the LiveCodes and DeadCodes, and work on making the connection with the output's codes. That would be a pretty big and tedious job, but it might be worth the effort...
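
To make the LifeStatus idea concrete, here is a minimal Python sketch of deriving a logical alive/dead value from a user's status column and the codes they declare as "alive". The function name, the example codes (`A`, `AL`, `D`) and the row layout are all hypothetical, not taken from the app; the other code columns are left untouched.

```python
from typing import Iterable, Optional


def life_status(raw_status: Optional[str], alive_codes: Iterable[str]) -> Optional[bool]:
    """Return True (alive), False (dead), or None when the status is missing."""
    if raw_status is None or raw_status.strip() == "":
        return None  # no information: do not guess alive or dead
    return raw_status.strip() in set(alive_codes)


# Hypothetical census rows and user answer to "which codes mean alive?"
rows = [
    {"tree": "T1", "status": "A"},
    {"tree": "T2", "status": "D"},
    {"tree": "T3", "status": "AL"},
    {"tree": "T4", "status": ""},
]
alive_codes = {"A", "AL"}

statuses = {row["tree"]: life_status(row["status"], alive_codes) for row in rows}
```

In this sketch a missing status stays missing rather than being treated as dead, which is one possible choice among others.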

gabrielareto commented 2 years ago

LifeStatus should be TRUE/FALSE = alive / dead. This is the most important aspect that we may want to store. We should not mix this important variable (inherently binary) with other types of information. I.e. a tree broken at 1.29 m should NOT be encoded as "dead" here.

Some people keep codes that mean "there is a good reason why DBH = NA" or "there is a good reason why DBH = 0 mm". This may be relevant for us because in the correction step we may impute missing DBHs or fit models for growth, etc. Other than that, the "broken below" code is no different from other status codes like "it is leaning", "it is infested by lianas", etc. It is not appropriate to try to fit it into the LifeStatus code.

We would solve many problems if we found a way to allow mapping between [all categories in the network variables] and [all categories in the user's variables]. Categories can be distributed across one or more variables, both in the user data and in the network format.

ValentineHerr commented 2 years ago

Question: I now have a table showing all of the codes that exist in the column(s) chosen by the user. Right now, I have the table editable, so the user can enter the definition of each code. This is potentially a pretty long process. Would it be best to have a dropdown menu filled with pre-written definitions? (I think I can find a way to also give the option to write a custom definition.) The advantages would be 1) standard definitions for common codes, and 2) the mapping of codes would be pre-filled after the user selects the output profile, for codes that have matching definitions. The disadvantage would be that we would need to come up with a bank of definitions. I'll need help with that.

cpiponiot commented 2 years ago

I think a mix of predefined codes and custom options is the best solution. We could define 10-20 common codes with the working group. We can first gather this information from the data dictionaries (Gabriel or I could do this), come up with a first selection, and then discuss it with everyone.
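
As a rough illustration of mixing predefined and custom definitions, the Python sketch below picks a definition from a small shared bank when one is selected and falls back to free text otherwise. The bank contents, the keys and the `define_code` helper are hypothetical placeholders; the actual list of common codes would come from the working group.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical bank of shared definitions (placeholder entries only).
DEFINITION_BANK = {
    "leaning": "Stem is leaning",
    "broken_below": "Stem broken below the point of measurement",
    "not_found": "Tree not found during the census",
}


@dataclass(frozen=True)
class CodeDefinition:
    code: str
    definition: str
    is_custom: bool


def define_code(code: str, bank_key: Optional[str] = None,
                custom_text: Optional[str] = None) -> CodeDefinition:
    """Prefer a predefined definition; otherwise accept the user's own text."""
    if bank_key is not None:
        if bank_key not in DEFINITION_BANK:
            raise KeyError(f"Unknown predefined definition: {bank_key!r}")
        return CodeDefinition(code, DEFINITION_BANK[bank_key], is_custom=False)
    if custom_text and custom_text.strip():
        return CodeDefinition(code, custom_text.strip(), is_custom=True)
    raise ValueError(f"No definition given for code {code!r}")


predefined = define_code("L", bank_key="leaning")
custom = define_code("Z", custom_text="  Struck by lightning ")
```

Flagging custom entries (`is_custom`) would make it easy to review them later and decide whether they belong in the shared bank.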

ValentineHerr commented 2 years ago

Thanks @cpiponiot for the feedback! I started a first set of codes here. Feel free to modify. I added the column "Source" just so we know where each definition is coming from, but it is not required to fill that in if we make our own definitions.

ValentineHerr commented 2 years ago

Update on this issue. This still needs work, but the main structure is here:

  1. If the user specified one or more columns with "codes", the app will ask them to select (or make up) definitions for each code of each column. If those definitions are already saved in the user profile, there is a button to auto-populate them.

     [Screenshot: Codes]

  2. At the end, if the user selects an output profile that does have codes, they will have to match their codes to the codes in the output profile. This is an n-to-1 relationship for now; n-to-n is not possible right now. That table is auto-populated if the definitions selected in step 1 match the ones from the output profile, but the user is prompted to double-check and complete it.

     [Screenshot: TranslateCodes]

  3. I still need to work on actually translating the codes in the data.
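
The matching and translation steps above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: definitions are compared as case-insensitive text, and the function names, codes and definitions are all hypothetical rather than the app's actual implementation.

```python
from typing import Dict, List, Optional


def prefill_mapping(user_defs: Dict[str, str],
                    output_defs: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each user code to the output code sharing its definition, else None."""
    by_definition = {text.strip().lower(): code for code, text in output_defs.items()}
    return {code: by_definition.get(text.strip().lower())
            for code, text in user_defs.items()}


def translate(values: List[str], mapping: Dict[str, Optional[str]]) -> List[Optional[str]]:
    """Replace user codes by output codes; unmapped codes become None."""
    return [mapping.get(value) for value in values]


# Hypothetical definitions chosen by the user and offered by the output profile.
user_defs = {"L": "Stem is leaning", "LN": "stem is leaning", "X": "Struck by lightning"}
output_defs = {"LEAN": "Stem is leaning", "BRK": "Stem broken below the point of measurement"}

mapping = prefill_mapping(user_defs, output_defs)  # "X" is left for the user to complete
translated = translate(["L", "X", "LN"], mapping)
```

Because several user codes can point to the same output code but each user code has at most one target, a plain dictionary is enough to represent the n-to-1 relationship.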

ValentineHerr commented 2 years ago

Oh, and I need to help the user with this last step, in the table with radio buttons, because at that point the user has no idea what the codes in the headers mean. I am hoping I can have a pop-up help text that would show the definition for those codes.

And... right now it does not properly handle the case(s) where two columns may have the same code but not the same meaning.
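
One possible way to avoid that collision, sketched in Python with made-up column names and codes, is to key definitions by the (column, code) pair instead of by the code alone:

```python
# Hypothetical (column, code, definition) entries as a user might fill them in.
entries = [
    ("Status", "B", "Stem broken below the point of measurement"),
    ("Damage", "B", "Bark damage"),
]

# Keyed by code alone: the second "B" silently overwrites the first.
by_code = {code: definition for _, code, definition in entries}

# Keyed by (column, code): both meanings are kept apart.
by_column_and_code = {
    (column, code): definition for column, code, definition in entries
}
```

The same pair could then serve as the key when matching user codes to output-profile codes, so that identical codes from different columns are never merged.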

ValentineHerr commented 2 years ago

Closing this, as it is done. It needs testing, but we can open new issues if there are problems.