uaf-arctic-eco-modeling / dvm-dos-tem

A process based Dynamic Vegetation, Dynamic Organic Soil, Terrestrial Ecosystem Model.
MIT License

another small issue with some weird quotes using --csv2fwt-v1 #682

Open jsclein-uaf opened 9 months ago

jsclein-uaf commented 9 months ago
// CMT52 // Heath-lichen // ##Values from Trail Valley###
//  dvmdostem parameters // v0.7.0-183-gc44a88a1
//  cmtdescription // ""A long winded description. Spaces? Quotes? Special charachters?"""
//  calibration site // ""Trail Valley"""
//  calibration notes // ""Calibrator+VB and JC, Nov 2023”"
//  references file // refs.bib

The above is what the header looks like after you run param.py --csv2fwt-v1.

But I am not sure why, in the fifth line of the header (// calibration notes), the first of the quote marks after 2023 is a different character: a curly quote (”) instead of a straight one ("). It seems to change the codec or something like that; I don't totally get it. I had to remove it, and then it runs. I think it isn't recognized as ASCII or something.
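
One way to locate such a character is to scan the text for anything outside the ASCII range. The following is a minimal sketch, not part of param.py; the helper name `find_non_ascii` is hypothetical, and the sample text is copied from the header above with the curly quote written as the escape `\u201d`.

```python
def find_non_ascii(text):
    """Return (line_number, column, character) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:  # anything above 127 is outside ASCII
                hits.append((line_no, col, ch))
    return hits


# Two header lines from the output above; \u201d is the right curly quote.
header = (
    '//  calibration site // ""Trail Valley"""\n'
    '//  calibration notes // ""Calibrator+VB and JC, Nov 2023\u201d"\n'
)

for line_no, col, ch in find_non_ascii(header):
    print(f"line {line_no}, column {col}: {ch!r} (U+{ord(ch):04X})")
```

To check a real file, read it with an explicit encoding (for example `open(path, encoding="utf-8").read()`, assuming the export is UTF-8) and pass the result to the same function.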

jsclein-uaf commented 9 months ago

Not even sure if we want to keep all these header lines or not, although that's another story.

tobeycarman commented 9 months ago

@jsclein-uaf we do want the header, all lines of them, or at least the support for them. You will notice that in the "template" there is some junk text ("A long winded description..."). The idea is that when you are working in the csv file you will update the csv file with appropriate comments describing the calibration. It would be most valuable if you were to update these headers appropriately for each calibration you work on. This includes removing the irrelevant sample text that is there. I think it is still up for debate how much info and what type of info should get stored in the header, so I would appreciate it if you were to fill it out with the info that seems appropriate - then we can make sure that the --csv2fwt-v1 functionality can properly handle the conversion.

Re: quotes and other special characters, this is always going to be a wrinkle - when you export from Excel (or other spreadsheet) there are frequently options about how you want the fields to be handled - quoted or not, single or double, stuff like that. And I suspect that some special characters and spaces will usually get quoted.
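
To illustrate how quoting options change what ends up in the file, here is a small sketch that uses Python's standard `csv` module as a stand-in for a spreadsheet exporter. This is an assumption for illustration only: Excel and other spreadsheets may make different choices, and `to_csv_line` is a hypothetical helper. It does show one way the doubled quotes seen in the header can arise: a field that already contains quote marks is wrapped in quotes, and each embedded quote is doubled.

```python
import csv
import io


def to_csv_line(fields, quoting=csv.QUOTE_MINIMAL):
    """Write one row with the csv module and return it as a string."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerow(fields)
    return buf.getvalue().rstrip("\n")


# No special characters: nothing is quoted under QUOTE_MINIMAL.
plain = to_csv_line(["calibration site", "Trail Valley"])

# The field itself contains quote marks: it is wrapped and the quotes doubled.
embedded = to_csv_line(["calibration site", '"Trail Valley"'])

# QUOTE_ALL wraps every field regardless of content.
everything = to_csv_line(["calibration site", "Trail Valley"], quoting=csv.QUOTE_ALL)

print(plain)       # calibration site,Trail Valley
print(embedded)    # calibration site,"""Trail Valley"""
print(everything)  # "calibration site","Trail Valley"
```

Reading `embedded` back with `csv.reader` recovers the original field, quote marks included, so the doubling is an encoding detail rather than a change to the data.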

Also note that I think avoiding having any string in the header comments that contains "CMT" would be a good idea because the parser that reads the fixed width text looks for the "CMT" string to know where to split the sections up.
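
A quick pre-check for this could look like the sketch below. It assumes that the first line of a block is the legitimate `// CMT...` line and that every later comment line should be free of the string "CMT". The helper `comment_lines_with_cmt` is hypothetical and does not reproduce the actual parser logic; the last sample line is invented to show a hit.

```python
def comment_lines_with_cmt(header_lines):
    """Return (line_number, text) for lines after the first that contain 'CMT'."""
    return [
        (n, line)
        for n, line in enumerate(header_lines, start=1)
        if n > 1 and "CMT" in line  # line 1 is the real CMT marker line
    ]


sample = [
    "// CMT52 // Heath-lichen // ##Values from Trail Valley###",
    "//  dvmdostem parameters // v0.7.0-183-gc44a88a1",
    "//  calibration site // Trail Valley",
    "//  calibration notes // values copied from CMT04",  # invented example
]

for n, line in comment_lines_with_cmt(sample):
    print(f"line {n} contains 'CMT': {line}")
```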

tobeycarman commented 9 months ago

Also note that it is entirely possible to put all sorts of stuff in your .xl (or google sheets, or other spreadsheet file). You can have all kinds of formatting, formulas, extra characters, and other things that simply will never translate to .csv. So the idea is that you work in the .xl file because it is easier to do certain operations (sorting by certain columns, bulk updating values, etc), but at the end of the day you need to be able to export a "clean" csv file that the param.py --csv2fwt-v1 tool can handle. So I think it will be easiest if you try to keep the spreadsheet file relatively simple and clean so that the export to csv (and subsequent translation to fixed width text) works smoothly.
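
As one possible clean-up step before running the conversion, typographic quotes can be mapped back to their plain ASCII equivalents. This is a sketch of a hypothetical pre-processing helper, not something param.py is known to do; the mapping covers only the four common curly quote characters.

```python
# Map curly single and double quotes to their straight ASCII counterparts.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
})


def normalize_quotes(text):
    """Replace typographic quotes with straight ASCII quotes."""
    return text.translate(QUOTE_MAP)


cleaned = normalize_quotes("Calibrator+VB and JC, Nov 2023\u201d")
print(cleaned)
print(cleaned.isascii())
```

Other non-ASCII characters (accented letters, non-breaking spaces, and so on) pass through unchanged, so a final `isascii()` check on the whole file is still worthwhile if the fixed width text is expected to be pure ASCII.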