Alliance-for-Tropical-Forest-Science / DataHarmonization

Code to run the data harmonization app and support cross-site analysis
https://alliance-for-tropical-forest-science.github.io/DataHarmonization/
"status" is used ambiguously in step 2 of "headers and units" #26

Closed gabrielareto closed 1 year ago

gabrielareto commented 1 year ago

There are a couple of questions that say "which of your status(es) represent a LIVE tree?" and "which of your status(es) represent a DEAD tree?"

These two questions refer to two different variables, but they are phrased almost identically. I think it is better to make the links more explicit, perhaps by numbering the questions within that particular block: "Which of your [etc.] (see question x of block xxxxx)".

ValentineHerr commented 1 year ago

I am not sure I am understanding this question.

The reason for these two very similar questions is twofold:

Does that make sense @gabrielareto ?

gabrielareto commented 1 year ago

I think my comment is related to the wording and clarity of the question.

Assuming that those two questions refer to one column each, I would rephrase them to make the connection with the variable explicit, instead of simply using the word "status", which could refer to two different columns.

"which of your status(es) represent a LIVE tree in [the live codes column]?"
"which of your status(es) represent a DEAD tree in [the dead codes column]?"

Or would it be this?

"which of your status(es) represent a LIVE tree in [the dead codes column]?"
"which of your status(es) represent a DEAD tree in [the live codes column]?"

In any case, the words "live tree" or "dead tree" are not enough to link each question with the column it refers to, because dead and alive trees can be described in either column, or in both.

I think there needs to be a 1-to-1 correspondence between the variables and the questions. A "1" or "TRUE" can mean dead or alive, depending on the column in which it appears. Please confirm.

The most common case, I think, is for all codes to be stored in the same column. We need to be extra clear here because the distinction between [the live codes column] and [the dead codes column] comes as a bit of a surprise to some users.
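To illustrate why the word "status" alone is ambiguous, here is a minimal Python sketch, not the app's actual code. The column names (`live_status`, `dead_status`), the code value `"1"`, and the function `interpret` are all hypothetical. It shows the same code meaning "alive" in one column and "dead" in another, so each question has to name the column it applies to.

```python
# Hypothetical column names and codes, for illustration only.
# Each mapping answers one question: "which codes in THIS column mean X?"
LIVE_CODES = {"live_status": {"1"}}  # codes meaning "alive", keyed by column
DEAD_CODES = {"dead_status": {"1"}}  # codes meaning "dead", keyed by column


def interpret(column: str, code: str) -> str:
    """Return 'alive', 'dead', or 'unknown' for a code found in a given column."""
    if code in LIVE_CODES.get(column, set()):
        return "alive"
    if code in DEAD_CODES.get(column, set()):
        return "dead"
    return "unknown"


# The same code is read differently depending on the column it sits in.
print(interpret("live_status", "1"))
print(interpret("dead_status", "1"))
```

The single-column case fits the same structure: both mappings would simply use the same column name as their key.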

ValentineHerr commented 1 year ago

@gabrielareto how about this? [image]

Making what is in the brackets interactive (so that it changes to the name of the column the user selected) would be tricky and would likely cause issues. If you think that is necessary, I'll try.

gabrielareto commented 1 year ago

This is an improvement. Please implement it; don't try to make it interactive.

Is it possible to organize the questions so they are ordered like this?

1/ what is the column that contains the LIVE status?
2/ what are the codes that describe a living tree in the column that contains the LIVE status?
3/ what is the column that contains the DEAD status? (it may be the same)
4/ what are the codes that describe a dead tree in the column that contains the DEAD status?

We could be more redundant and add more questions about codes:

1/ what is the column that contains the LIVE status?
2/ what are the codes that describe a living tree in the column that contains the LIVE status?
3/ what are the codes that describe a dead tree in the column that contains the LIVE status?
4/ what is the column that contains the DEAD status? (it may be the same)
5/ what are the codes that describe a dead tree in the column that contains the DEAD status?
6/ what are the codes that describe a living tree in the column that contains the DEAD status?

I am thinking about ForestPlots flags. If I remember correctly, both columns contain information to differentiate between dead and alive individuals. There is a special code in the "LIVE status" column that means "dead" and there is a special code in the "DEAD status" column that means "alive". Right? We could check for coherence within the app to make a final estimate of status.
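As a rough sketch of what such a coherence check could look like, assuming the six-question variant above. This is written in Python for illustration and is not the app's code; the function names, the `answers` keys, and the example codes are all hypothetical and are not actual ForestPlots flags.

```python
from typing import Optional, Set


def status_from(value: str, live_codes: Set[str], dead_codes: Set[str]) -> Optional[str]:
    """Read one column's value using that column's own live and dead codes."""
    if value in live_codes:
        return "alive"
    if value in dead_codes:
        return "dead"
    return None  # the value says nothing about status


def combined_status(live_col_value: str, dead_col_value: str, answers: dict) -> str:
    """Combine both columns; report 'conflict' when they disagree."""
    a = status_from(live_col_value, answers["live_col_live"], answers["live_col_dead"])
    b = status_from(dead_col_value, answers["dead_col_live"], answers["dead_col_dead"])
    if a is None:
        return b or "unknown"
    if b is None or a == b:
        return a
    return "conflict"


# Hypothetical answers to questions 2, 3, 5 and 6 (made-up codes).
answers = {
    "live_col_live": {"a", "b"},  # codes meaning alive in the LIVE column
    "live_col_dead": {"0"},       # special code meaning dead in the LIVE column
    "dead_col_dead": {"1", "2"},  # codes meaning dead in the DEAD column
    "dead_col_live": {"0"},       # special code meaning alive in the DEAD column
}

print(combined_status("a", "0", answers))  # both columns agree
print(combined_status("a", "1", answers))  # columns disagree
```

The sketch only flags a disagreement; deciding which column to trust in that case is a separate choice that it deliberately leaves open.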

ValentineHerr commented 1 year ago

Unfortunately, I can't organize the questions this way, because when something is changed in column 1 it resets column 2. So if we make the questions consecutive, answering 3 will erase 2. If I have more time I'll try to bypass this issue, but for now I'll work on all the other things.

I don't think the 6 questions are necessary; 4 questions are enough to deal with ForestPlots flags. I don't think we should check for coherence: this would be very site-specific and would require more decisions and interactions with the user (which column to trust?).

gabrielareto commented 1 year ago

Thanks for the clarification. I agree. As long as the questions are worded clearly, we are OK.