VincyaneBadouard / TreeData_broken

Tool for harmonizing and correcting forest data.
https://vincyanebadouard.github.io/TreeData/

data dictionary for input #2

Closed gabrielareto closed 2 years ago

gabrielareto commented 2 years ago

The translation tool will take data in format X and return data in format Y. The output should include a description of format Y (i.e. a data dictionary for format Y). Do we need a data dictionary for format X as well? If we have one, it may reduce the guessing we have to do. For example, with a good data dictionary for format X we don't have to guess the DBH units or the minimum DBH, and we can flag errors right away.

It could be a general solution. The machine would tell the user of format X: "let's build your data dictionary first and then we can talk details".

Example:

Your first column is called xxxxx. Select the data type [dropdown menu with numeric, character, etc.]

[user interacts]

What is the minimum acceptable value in xxxxx? The maximum?

[user interacts]

What are the units?

[user interacts]

Etc. A long conversation...
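
To make the idea concrete, here is a minimal sketch of what the answers from such a conversation could look like once collected, and how they could be used to flag errors right away. This is only an illustration, not how the tool works: `ColumnSpec`, `validate`, the column names, the bounds, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnSpec:
    """One data dictionary entry, filled in from the user's answers (hypothetical)."""
    name: str
    dtype: type
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    units: Optional[str] = None


def validate(rows, specs):
    """Return a list of (row_index, column, message), one per violation found."""
    errors = []
    for i, row in enumerate(rows):
        for spec in specs:
            value = row.get(spec.name)
            if value is None:
                errors.append((i, spec.name, "missing value"))
                continue
            if not isinstance(value, spec.dtype):
                errors.append((i, spec.name, f"expected {spec.dtype.__name__}"))
                continue
            units = f" {spec.units}" if spec.units else ""
            if spec.minimum is not None and value < spec.minimum:
                errors.append((i, spec.name, f"{value} is below the minimum {spec.minimum}{units}"))
            if spec.maximum is not None and value > spec.maximum:
                errors.append((i, spec.name, f"{value} is above the maximum {spec.maximum}{units}"))
    return errors


# Hypothetical dictionary for format X: the user said DBH is numeric, in cm,
# with a minimum of 10 and a maximum of 300.
specs = [
    ColumnSpec("DBH", float, minimum=10.0, maximum=300.0, units="cm"),
    ColumnSpec("Species", str),
]

# Made-up rows, for illustration only.
rows = [
    {"DBH": 12.5, "Species": "sp1"},
    {"DBH": 4.0, "Species": "sp2"},    # below the declared minimum
    {"DBH": "n/a", "Species": "sp3"},  # wrong data type
]

errors = validate(rows, specs)
for row_index, column, message in errors:
    print(f"row {row_index}, column {column}: {message}")
```

With a dictionary like this, the DBH units and minimum are stated by the user rather than guessed, and rows that violate them can be reported before any translation to format Y.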

ValentineHerr commented 2 years ago

Yes, we will provide a data dictionary/metadata file with the output (Y).

For the input (X), we shouldn't need a data dictionary, as we are hoping that the interactions with the users will be enough to tell us everything we need or want to know. (This also means that not everything in the data they upload will be used, e.g. comments, surveyors, etc., at least not for now.) I am closing this issue and will start a new one to remind us to have a METADATA file ready to provide with the output.