VincyaneBadouard / TreeData_broken

Harmonization and correction forest data tool.
https://vincyanebadouard.github.io/TreeData/

Testing transforming from one format/profile to another #54

Closed: GeoPicka closed this issue 1 year ago

GeoPicka commented 2 years ago

@ValentineHerr The data download is incomplete (missing columns) when transforming ForestPlots data to the SEOSAW profile and vice versa. The metadata download file also has some missing info.

Data: Missing columns: D, HOM, LifeStatus

Metadata issues:

e.g. ForestPlots test data: ForestPlots_test13.1_trees_small.csv, ForestPlots_test13.1_plots_small.csv

ForestPlots profile: ForestPlots_Profile_NOCODES_4sept.zip

SEOSAW profile: SEOSAW_t4_Profile.zip

The downloaded (incomplete) data files should be in SEOSAW format but have the missing data and metadata errors described above: ForestPlots_test13.1_SEOSAWt4.zip

Note that this is only a problem with the downloads. The data table viewed on screen appears to be complete and contains all of the columns; only the downloaded data is missing them. It is also only an issue when switching from one network format to another: downloads appear correct and complete for both networks' test data (ForestPlots and SEOSAW) when downloaded in standard format. I also tried switching to the app's standard in between (rather than directly to a different network profile) and I get the same issue: missing data columns. I was unable to test downloading in the app's standard and then re-uploading before switching network profile, as I was unable to select the app's standard in the columns/units section.
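
The comparison described above (columns shown on screen versus columns present in the download) can be checked mechanically. This is a minimal sketch, not code from the app: the function name and the `PlotName`/`TreeID` columns are hypothetical, and only `D`, `HOM` and `LifeStatus` come from this report.

```python
import csv
import io


def missing_columns(displayed_columns, downloaded_csv_text):
    """Return the displayed columns that are absent from the downloaded CSV header."""
    reader = csv.reader(io.StringIO(downloaded_csv_text))
    header = next(reader, [])  # an empty download yields an empty header
    downloaded = {name.strip() for name in header}
    # Keep the on-screen order so the report is easy to compare by eye.
    return [col for col in displayed_columns if col not in downloaded]


# Hypothetical example: columns seen in the on-screen table vs. a downloaded file.
displayed = ["PlotName", "TreeID", "D", "HOM", "LifeStatus"]
downloaded_text = "PlotName,TreeID\nP1,1\n"

print(missing_columns(displayed, downloaded_text))  # ['D', 'HOM', 'LifeStatus']
```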

GeoPicka commented 2 years ago

I got similar issues with the downloads after uploading SEOSAW data and transforming to the ForestPlots profile (same profiles as in the previous comment).

Download_SEOSAWdata_to_ForestPlotsFormat.zip

GeoPicka commented 2 years ago

@ValentineHerr Note that when switching from one network format/profile to another, the app closes if one profile contains data codes but the other doesn't. E.g. ForestPlots has codes but SEOSAW doesn't (as there are no codes in the sample data received so far). I got around this by creating an additional profile for ForestPlots without the codes, so that the 2 profiles were then compatible. But I thought I should let you know in case it is a bug or something everyone should be aware of.

GeoPicka commented 2 years ago

Hi @ValentineHerr, I have been testing the Codes functionality more closely. All working fine when only 1 of the profiles has codes. However, if both profiles have codes there are a few teething issues with the translation process and downloads:

During code translation in the app:

Downloaded files after translation:

Let me know if you'd like any of the files/profiles to recreate these issues.

ValentineHerr commented 2 years ago

@GeoPicka, these are all fair requests! I'll see what I can do.

Could you send me the ForestPlots and SEOSAW profiles you used when you detected these issues? I think I probably have the data I need.

GeoPicka commented 2 years ago

Hi @ValentineHerr. During translation testing of the ForestPlots, SEOSAW and Standard profiles I have had a few thoughts. Putting them here to keep a record. Perhaps you have already considered and decided against some of them, or perhaps they are for the group to consider once we have more profiles. They are quite specific to the profiles I've been testing, so feel free to ignore them until we have a wider perspective.

a) Potential bugs/translation issues

  • Missing info in the app's standard: certain 'missing' info should be calculable by the app's standard:

b) App development requests

ValentineHerr commented 2 years ago
> • [ ] Translated columns missing from data.csv download (but visible in the View data table after 'apply translation', before download)

@GeoPicka, I think this may be fixed with my latest clean-up of the app; let me know if not. If it is, you can tick it in the comment where this issue appears.

> • [x] Incorrect OutputValue in 'tree_codes_translation' download file when translating from ForestPlots to SEOSAW (F1=b == stem_mode=P, but in the download file the OutputValue incorrectly shows mode instead of P). For some reason it's pulling the last word of the column header rather than the translated code. This is not an issue when translating the same profiles in the opposite direction.

This should be fixed.

ValentineHerr commented 2 years ago

@GeoPicka, you should now be able to upload a .csv to fill in the code translation table. That .csv should have the same headers as the one that is downloaded after using the app. I am pasting here the one you sent me, except that I saved the "CORRECTED" tab of your Excel file and fixed the header names.

tree_codes_translation_FPlotsToSEOSAW.csv

For the user-friendliness of this code translation table, you can now wrap the rows so you don't have to scroll up/down as much, but the left/right scrolling is hard to get rid of. When I separate each set of columns into tabs (which looks great), the radio buttons stop working. My next idea is to color the columns based on which column they belong to in the output profile, and to have a legend that you can refer to when the header indicating the column name disappears on the left. Do you think that would be useful?

For now I am going to focus on your other requests.

GeoPicka commented 2 years ago
> > • [ ] Translated columns missing from data.csv download (but visible in view data table after 'apply translation' before download)

> @GeoPicka, I think this may be fixed with my latest clean up of the app, let me know if not. If it is you can tick it in the comment where this issue appears

> > • [x] Incorrect OutputValue in 'tree_codes_translation' download file when translating from ForestPlots to SEOSAW (F1=b == stem_mode=P, but in the download file the OutputValue incorrectly shows mode instead of P) For some reason it's pulling the last word of the column header rather than the translated code. This is not an issue when translating the same profiles in the opposite direction.

> this should be fixed

Hi @ValentineHerr, sorry, the 1st issue here still isn't fixed: the translated columns are missing from the data download file (but visible in the View table in the app prior to download). The 2nd issue (output value in the tree codes translation download file) is fixed, thank you.

GeoPicka commented 2 years ago

> @GeoPicka now you should be able to upload a .csv of to fill in the code translation table. That .csv should have the same headers as the one that is downloaded after using the app. I paste here the one you send me except I saved the "CORRECTED" tab of your excel file and fixed the header names.

> tree_codes_translation_FPlotsToSEOSAW.csv

> For the user-friendliness of this code translation table, you can now wrap the rows so you don't have to scroll as much up/down, but the left/right scrolling is hard to get rid of.... when I separate each set of column into tabs (which looks great), the functionality of the radio buttons fails... My next idea is to color the columns based on what column they belong to in the output profile... and have a legend that you can refer to when the header indicating the column name disappears on the left. Do you think that would be useful?

> For now I am going to focus on your other requests.

Thank you @ValentineHerr, uploading the code translation file is great. It mostly works perfectly, apart from one thing: I am unable to overwrite an (incorrectly) automatically matched translation with a NULL match from the uploaded translation file. E.g. when testing the ForestPlots to SEOSAW conversion, the app incorrectly identifies all NA and blank values for any code (e.g. F3, F4, LI, CF) as translated to decay = NA. The uploaded code translation csv file should overwrite all of these with blank translated values, i.e. overwrite decay = NA with a NULL value (no match), but instead decay = NA remains for all of them and the user must remove the automated matches manually. Is this possible to fix?
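
The requested behaviour, where a blank entry in the uploaded file replaces an automatic match, can be sketched as below. This is only an illustration of the request, not the app's implementation: the `InputColumn` and `InputValue` headers and both function names are hypothetical (only `OutputValue` is a header mentioned in this thread), and `None` stands in for a NULL / no-match translation.

```python
import csv
import io


def read_uploaded_translation(csv_text):
    """Parse an uploaded translation CSV; a blank OutputValue becomes None (no match)."""
    uploaded = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        output = (row["OutputValue"] or "").strip()
        uploaded[(row["InputColumn"], row["InputValue"])] = output or None
    return uploaded


def apply_uploaded_translation(auto_matches, uploaded):
    """Let every uploaded entry win over the automatic match, including blank ones."""
    merged = dict(auto_matches)
    for key, output in uploaded.items():
        # No "if output:" guard here: a None entry must still replace the auto match.
        merged[key] = output
    return merged


# Automatic matches as described in the report: NA values wrongly matched to decay = NA.
auto = {("F3", "NA"): "decay=NA", ("F1", "b"): None}
uploaded_csv = "InputColumn,InputValue,OutputValue\nF3,NA,\nF1,b,stem_mode=P\n"

merged = apply_uploaded_translation(auto, read_uploaded_translation(uploaded_csv))
print(merged)  # {('F3', 'NA'): None, ('F1', 'b'): 'stem_mode=P'}
```

The point of the sketch is the missing guard: if blank uploaded values were skipped instead of applied, the `decay=NA` match would survive, which is the symptom described above.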

ValentineHerr commented 2 years ago

> the translated columns are missing from the data download file (but visible in View table through the app prior to download).

@GeoPicka Can you make sure the columns that are not in the download actually have data in the View table? I have been removing columns that are all NAs in the download.
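
For reference, the filtering described above (dropping columns that contain only NAs before download) amounts to something like the following. This is a hedged sketch rather than the app's code: the function name is hypothetical, and treating empty strings and the text "NA" as missing is an assumption.

```python
NA_VALUES = {"", "NA"}  # assumption: what counts as a missing value in the export


def split_all_na_columns(rows):
    """Split column names into (kept, dropped), dropping columns with no real values.

    rows is a list of dicts sharing the same keys, one dict per table row.
    """
    columns = list(rows[0].keys()) if rows else []
    kept, dropped = [], []
    for col in columns:
        values = [row.get(col) for row in rows]
        if all(v is None or str(v).strip() in NA_VALUES for v in values):
            dropped.append(col)
        else:
            kept.append(col)
    return kept, dropped


# Hypothetical table: HOM has no data at all, D has data in only one row.
rows = [
    {"TreeID": "1", "D": "12.5", "HOM": "NA"},
    {"TreeID": "2", "D": "NA", "HOM": ""},
]

print(split_all_na_columns(rows))  # (['TreeID', 'D'], ['HOM'])
```

With a filter like this, a column that shows in the View table but holds only NAs will be absent from the download, so the check is whether the missing columns actually contain values on screen.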

ValentineHerr commented 2 years ago
> • New column request: PlotDataID / PlotID: Useful if networks have different snapshots of the data e.g. ForestPlots as PlotViewID - useful for allowing analysis of data that has changed protocol and/or multiple PlotViews of same plot eg different soil type/disturbance etc

@GeoPicka, I am working on this (I have already added the Cluster, which will be pushed soon). Can you explain to me how this would affect other parts of the data? Mainly I am thinking about IdTree. If I understand this column correctly, it means that for the same year you could have 2 measurements of the same tree but with 2 different protocols. Is that correct? If yes, that means that when I am assigning Tree IDs where they are missing, I need to make sure to look not only at the census ID but also at PlotViewID, right?

GeoPicka commented 1 year ago

> > the translated columns are missing from the data download file (but visible in View table through the app prior to download).

> @GeoPicka Can you make sure the columns that are not in the download actually have data in the View table? I have been removing columns that are all NAs in the download.

@ValentineHerr thank you for fixing that, the translated columns are now appearing in the download data.csv file. However, the values in the translated columns aren't quite right. E.g. for ForestPlots to SEOSAW, the translated values appear to be a concatenation of all input code column values, with a couple of translated-value replacements, rather than just the translated value from the appropriate column. Please see the data file showing each translated column as it appears in the data download (red) vs what the translated value should be (green) based on the translated code file (also included as a worksheet). Please let me know if you need any further info/data/profile.

data_GPcheck.xlsx

GeoPicka commented 1 year ago
> > • New column request: PlotDataID / PlotID: Useful if networks have different snapshots of the data e.g. ForestPlots as PlotViewID - useful for allowing analysis of data that has changed protocol and/or multiple PlotViews of same plot eg different soil type/disturbance etc

> @GeoPicka, I am working on this (I added the Cluster already, which will be pushed soon). Can you explain to me how this would affect other parts of the data? Mainly I am thinking about IdTree. If I understand this column correctly, this means that for a same year you could have 2 measurements of the same tree but with 2 different protocole? is that correct? If yes, that means that when I am assigning Tree IDs where they are missing, I need to make sure to not only look at census ID but also look at PlotViewID, right?

@ValentineHerr great, thank you so much for considering adding this. In terms of how ForestPlots works (for analysis/data sharing), PlotViewID wouldn't impact how you assign missing tree IDs: there should be no spatial overlap between different plot views in a dataset being shared/prepared for analysis. In ForestPlots, data can only be downloaded as the main plot view (all data including each tree, stem and census as measured in the field) or the preferred plot view (the plot view that best describes the plot for analysis, which accounts for any change in protocol). There can be many plot views for a plot stored in ForestPlots but only 1 preferred plot view available to download and analyse/share. (In rare cases a plot might have >1 preferred plot view, but only if there is no spatial overlap, e.g. if a plot contains different soil types in different parts of the plot. In that case the different plot views would contain different trees.) So for ForestPlots data and the app you wouldn't need to consider any effect on other parts of the data; see it more as an additional plot protocol identifier. But I'm not sure whether it is useful to other networks, or whether other networks use anything comparable and, if so, how that might affect other parts of their data. Hope that helps, let me know if anything is unclear.

ValentineHerr commented 1 year ago

> thank you for fixing that- the translated columns are now appearing in the download data.csv file. However, the values in the translated columns aren't quite right. e.g. for ForestPlots to Seosaw the translated values appear to be a concatenation of all input code column values, with a couple of translated values replacements, rather than just the translated value from the appropriate column.

Thanks for pointing that out @GeoPicka. I believe I fixed that issue. The main problem was that the code I wrote for this was based on ForestGEO data, which only has one-letter tree codes, and I did not anticipate that tree codes could be multiple letters and numbers like yours. Sorry about that!
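
To illustrate the difference between one-letter and multi-character codes, here is a minimal sketch. It is not the app's code, and the per-character function is only one plausible way a one-letter assumption could produce output like that described in the report. All function names, the `F2` value `ab2` and the table layout are hypothetical; the `F1=b` to `stem_mode=P` pair is taken from this thread.

```python
def translate_per_character(value, single_letter_table):
    """Translate each character separately; only safe when every code is one letter."""
    return "".join(single_letter_table.get(ch, ch) for ch in value)


def translate_row(row, code_columns, table):
    """Translate whole cell values, one input code column at a time.

    table maps (input_column, input_value) -> (output_column, output_value).
    Values without an entry are left untranslated and produce no output column.
    """
    translated = {}
    for col in code_columns:
        match = table.get((col, row.get(col)))
        if match is not None:
            out_col, out_val = match
            translated[out_col] = out_val
    return translated


TABLE = {("F1", "b"): ("stem_mode", "P")}
row = {"F1": "b", "F2": "ab2"}

# A multi-character code handled per character keeps the untranslated characters.
print(translate_per_character("ab2", {"b": "P"}))  # aP2

# Whole-value lookup per column returns only the translated value.
print(translate_row(row, ["F1", "F2"], TABLE))  # {'stem_mode': 'P'}
```

Keying the lookup on the pair of column and full cell value means a code such as `ab2` is either matched as a whole or not at all, regardless of its length.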

GeoPicka commented 1 year ago

> > thank you for fixing that- the translated columns are now appearing in the download data.csv file. However, the values in the translated columns aren't quite right. e.g. for ForestPlots to Seosaw the translated values appear to be a concatenation of all input code column values, with a couple of translated values replacements, rather than just the translated value from the appropriate column.

> Thanks for pointing that out @GeoPicka. I believed I fixed that issue. The main problem was that the code I wrote for this was based on ForestGEO data, which only has one-letter tree codes, and I did not anticipate tree codes could be multiple letters and numbers like you have. Sorry about that!

Hi @ValentineHerr, it's now working great. Thank you for fixing it.