mcannister-usgs closed this issue 2 years ago
We currently interpret a # symbol anywhere in a CSV as the start of a comment, so everything after the # on that line is ignored. This is clearly not always the appropriate behavior.
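To illustrate the behavior described above, here is a minimal sketch that assumes the wizard's read is equivalent to calling `pandas.read_csv` with `comment="#"`. The sample data and the helper name `read_with_comment` are made up for illustration and are not taken from the wizard's code.

```python
import io

import pandas as pd

# Hypothetical sample: the first data row has a '#' inside a value.
CSV_TEXT = "site,label,count\nA,plot #3,10\nB,plot 4,12\n"


def read_with_comment(text, comment):
    """Parse CSV text, treating `comment` (or nothing, if None) as a comment marker."""
    return pd.read_csv(io.StringIO(text), comment=comment)


# With '#' as the comment character, the rest of the first data row is dropped,
# so 'label' is cut short and 'count' is left missing for that row.
truncated = read_with_comment(CSV_TEXT, "#")

# With comment handling turned off, the row is read in full.
intact = read_with_comment(CSV_TEXT, None)

print(truncated)
print(intact)
```

The row without a # parses the same way in both cases, which is why the problem only shows up for some files.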
Ultimately it would be nice to expose this, along with some of the roughly 50 other parameters that pandas.read_csv offers, to handle the variability possible in the CSV format. For the most part we just take the defaults. This is where you would change that behavior in the code: https://github.com/usgs/fort-pymdwizard/blob/master/pymdwizard/core/data_io.py#L102 but you would also need to implement some form of UI for users to enter these parameters, probably on the current settings form.
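As a rough sketch of what exposing these parameters could look like, the function below merges user-supplied options over assumed defaults and passes them to `pandas.read_csv`. The names `read_csv_with_settings`, `DEFAULT_CSV_OPTIONS`, and `ALLOWED_CSV_OPTIONS` are hypothetical and do not exist in pymdwizard; only the `read_csv` keywords themselves (`comment`, `sep`, `encoding`, `quotechar`, `skiprows`) are real pandas parameters.

```python
import io

import pandas as pd

# Assumed current behavior: '#' starts a comment. Not copied from pymdwizard.
DEFAULT_CSV_OPTIONS = {"comment": "#"}

# Hypothetical whitelist of read_csv keywords a settings form might expose.
ALLOWED_CSV_OPTIONS = {"comment", "sep", "encoding", "quotechar", "skiprows"}


def read_csv_with_settings(source, user_options=None):
    """Read a CSV, letting user-chosen options override the defaults."""
    options = dict(DEFAULT_CSV_OPTIONS)
    for key, value in (user_options or {}).items():
        if key not in ALLOWED_CSV_OPTIONS:
            raise ValueError(f"Unsupported CSV option: {key!r}")
        options[key] = value
    return pd.read_csv(source, **options)


# Example: a semicolon-delimited file where '#' is ordinary data.
df = read_csv_with_settings(
    io.StringIO("a;b\n1;x#y\n"),
    {"comment": None, "sep": ";"},
)
print(df)
```

Restricting the options to a whitelist keeps the settings form small and avoids passing arbitrary keywords through to pandas.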
Currently, your only option is to make a copy of the CSV with each # replaced by another character, run the wizard against that copy, and then change the substitute character back to # in the EA section.
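The workaround above could be scripted roughly as follows. This is only a sketch: the placeholder token `__POUND__`, the function names, and the UTF-8 encoding are all assumptions chosen for illustration, and changing the placeholder back in the EA section would still be done by hand in the wizard.

```python
from pathlib import Path

# Hypothetical stand-in for '#'; pick something that does not occur in the data.
PLACEHOLDER = "__POUND__"


def replace_pound(text, placeholder=PLACEHOLDER):
    """Return `text` with every '#' swapped for `placeholder`."""
    if placeholder in text:
        raise ValueError("Placeholder already occurs in the data; choose another.")
    return text.replace("#", placeholder)


def restore_pound(text, placeholder=PLACEHOLDER):
    """Reverse replace_pound, e.g. for values copied out of the EA section."""
    return text.replace(placeholder, "#")


def copy_csv_without_pound(src, dst, placeholder=PLACEHOLDER, encoding="utf-8"):
    """Write a copy of the CSV at `src` to `dst` with '#' replaced."""
    src, dst = Path(src), Path(dst)
    cleaned = replace_pound(src.read_text(encoding=encoding), placeholder)
    dst.write_text(cleaned, encoding=encoding)
    return dst


# Usage (paths are placeholders):
# copy_csv_without_pound("original.csv", "original_no_pound.csv")
```

Using a multi-character token rather than a single character makes it easier to find and restore the original # afterwards.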
Closing this ticket with the notes from Colin's comment above. The reported issue is caused by the # sign in the file being interpreted as a comment. I will flag this for @ennsk and @tnorkin so they are aware of it.
I've attached a couple of CSVs that I tried to read into the tool today. The first few column labels are read correctly, but not all columns are listed, and the values described under the columns do not line up with the values from the original files. The screenshot below shows headers loaded from the first 4 columns but data loaded from the final 4 columns of the dataset.
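One plausible way to get this kind of header and data mismatch is a # inside a column label. The sketch below uses made-up data, not the attached files, and assumes the read behaves like `pandas.read_csv` with `comment="#"`: the header row is cut off at the #, and pandas then treats the extra leading fields in each data row as an index, so the remaining labels end up over the trailing columns.

```python
import io

import pandas as pd

# Hypothetical file: the third column label contains a '#'.
CSV_TEXT = (
    "id,name,sample #,depth,temp\n"
    "1,a,7,2.5,10\n"
    "2,b,8,3.0,11\n"
)

# Header is truncated to three labels, but each data row still has five fields.
shifted = pd.read_csv(io.StringIO(CSV_TEXT), comment="#")

# Without comment handling, all five labels line up with their own data.
aligned = pd.read_csv(io.StringIO(CSV_TEXT), comment=None)

print(shifted)
print(aligned)
```

If the attached files have a # in a header row, checking them this way should show whether the same shift occurs.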
LDWF Data.zip