monroews / CEE4530

MIT License

formatting the TSV file for the ANC lab #100

Open el545 opened 5 years ago

el545 commented 5 years ago

I'm kind of struggling with the markdown analysis. Our data files in .tsv format (we attached a screenshot of the format we're getting) aren't formatted the same way as the example .tsv file, and we don't know how to change the format so that we can do the data analysis.

[Screenshot attached: 2019-02-27, 8:37:42 PM]

monroews commented 5 years ago

The epa functions are all designed to extract data from the files created by ProCoDA. I'm guessing that somewhere we gave the mistaken impression that the files need to be modified. They do not: the epa functions should all work on files that are unchanged from ProCoDA.

It would help me to know where the idea of changing the files to other formats came from.

jacqueline-wong commented 5 years ago

I am experiencing this issue as well. I am able to use the Gran function with Monroe's example file but not the one in my GitHub (https://raw.githubusercontent.com/lw583/CEE4530/master/Lab3/0_minute_sample.xls).


ValueError                                Traceback (most recent call last)
pandas/_libs/lib.pyx in pandas._libs.lib.maybe_convert_numeric()

ValueError: Unable to parse string "Titrant Volume (ml)"

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
----> 1 V_titrant, pH, V_Sample, Normality_Titrant, V_equivalent, ANC = epa.Gran(Gran_data_0)

/anaconda3/lib/python3.6/site-packages/aguaclara/research/environmental_processes_analysis.py in Gran(data_file_path)
    291     """
    292     df = pd.read_csv(data_file_path, delimiter='\t', header=5)
--> 293     V_t = np.array(pd.to_numeric(df.iloc[0:, 0]))*u.mL
    294     pH = np.array(pd.to_numeric(df.iloc[0:, 1]))
    295     df = pd.read_csv(data_file_path, delimiter='\t', header=-1, nrows=5)

/anaconda3/lib/python3.6/site-packages/pandas/core/tools/numeric.py in to_numeric(arg, errors, downcast)
    133             coerce_numeric = False if errors in ('ignore', 'raise') else True
    134             values = lib.maybe_convert_numeric(values, set(),
--> 135                                                coerce_numeric=coerce_numeric)
    136
    137     except Exception:

pandas/_libs/lib.pyx in pandas._libs.lib.maybe_convert_numeric()

ValueError: Unable to parse string "Titrant Volume (ml)" at position 0

monroews commented 5 years ago

Is this the original file or did you open it and save it again in Excel?

monroews commented 5 years ago

I've compared your files with the original files and your files have extra tabs inserted. There shouldn't be any tabs inserted at the end of ANY lines.

You can manually fix the file by deleting the tabs at the end of lines.
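
If there are many files, deleting the trailing tabs can be scripted. This is only a minimal sketch, assuming the trailing tabs are the only change made to the file. The function names and the `encoding` default are hypothetical helpers for illustration; they are not part of aguaclara or ProCoDA.

```python
import re

# One or more tabs sitting right before the end of a line (LF or CRLF).
_TRAILING_TABS = re.compile(r"\t+(?=\r?$)", re.MULTILINE)


def find_trailing_tab_lines(text):
    """Return the 1-based numbers of lines that end with a tab."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if line.rstrip("\r").endswith("\t")
    ]


def strip_trailing_tabs(text):
    """Delete tabs at the end of every line; tabs between columns are kept."""
    return _TRAILING_TABS.sub("", text)


def clean_file(source, destination, encoding="utf-8"):
    """Write a cleaned copy of source and return how many lines were changed.

    The encoding is an assumption; adjust it if the file does not decode.
    newline="" keeps the original line endings untouched.
    """
    with open(source, "r", encoding=encoding, newline="") as handle:
        text = handle.read()
    with open(destination, "w", encoding=encoding, newline="") as handle:
        handle.write(strip_trailing_tabs(text))
    return len(find_trailing_tab_lines(text))
```

Running `find_trailing_tab_lines` first shows whether a file was altered at all. Writing to a separate destination keeps the original file around in case something else is wrong with it. Using the unmodified file straight from ProCoDA avoids the problem entirely.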

jacqueline-wong commented 5 years ago

I just re-downloaded the files (I had taken them from someone else) and uploaded them directly to GitHub, and it worked. Thank you!