cnobles / iGUIDE

Bioinformatic pipeline for identifying dsDNA breaks by marker based incorporation, such as breaks induced by designer nucleases like Cas9.
https://iguide.readthedocs.io/en/latest/
GNU General Public License v3.0

Quoted CSV files crash import_sample_info #48

Closed: ressy closed this issue 5 years ago

ressy commented 5 years ago

If my sampleInfo table has quoted fields (as R's write.csv produces by default, for example), import_sample_info treats the quotes as part of the text and can't find the sampleName column:

ValueError in line 29 of /home/jesse/analysis/iguide-analyses/iGUIDE/Snakefile:
'sampleName' is not in list
  File "/home/jesse/analysis/iguide-analyses/iGUIDE/Snakefile", line 29, in <module>
  File "/home/jesse/miniconda3/envs/iguide/lib/python3.6/site-packages/iguidelib/__init__.py", line 11, in import_sample_info

Python's built-in csv module could handle this kind of formatting automatically.

ressy commented 5 years ago

This is the DictReader class I mentioned: https://docs.python.org/3/library/csv.html#csv.DictReader (By default it takes the first row as the keys, so it's a bit like read.table with header=TRUE, but it yields one dict per row instead of a data frame.)
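
For illustration, here is a minimal sketch of reading a quoted sample sheet with csv.DictReader. The helper name read_sample_names, the barcode1 column, and the sample values are hypothetical and not taken from iGUIDE; only the sampleName column comes from this issue.

```python
import csv
import io

# Hypothetical sample sheet in the style R's write.csv produces (quoted fields).
# Only the sampleName column is from the issue; the rest is made up.
quoted_csv = '"sampleName","barcode1"\n"sample-1","ACGT"\n"sample-2","TGCA"\n'


def read_sample_names(handle):
    """Return the sampleName column from an open CSV file handle."""
    # DictReader uses the first row as keys and removes the surrounding quotes.
    reader = csv.DictReader(handle)
    return [row["sampleName"] for row in reader]


# io.StringIO stands in for an open file so the sketch is self-contained.
sample_names = read_sample_names(io.StringIO(quoted_csv))
print(sample_names)
```

With a real file, the csv documentation recommends opening it with newline="" before passing the handle to the reader.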

cnobles commented 5 years ago

I think this is a good solution. I was wondering whether it would also work with TSV files, and apparently it does. I want to read up on it more, though.
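
On the TSV question, a rough sketch of how that might look: csv.DictReader accepts a delimiter argument, so tab-separated input can go through the same reader. The column names and values other than sampleName are invented for the example.

```python
import csv
import io

# Hypothetical tab-separated sample sheet; one field is quoted on purpose.
tsv_text = 'sampleName\tbarcode1\n"sample-1"\tACGT\nsample-2\tTGCA\n'

# delimiter="\t" switches the reader from commas to tabs; quote handling is unchanged.
rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
tsv_names = [row["sampleName"] for row in rows]
print(tsv_names)
```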

I found another solution using the replace('"', '') method on strings. It gets rid of all the quotes, and I implemented it in the current definition I was using.
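
A small sketch of the quote-stripping idea, assuming the header line is split on commas and the column is located with list.index (an assumption based on the "'sampleName' is not in list" error above, not the actual iguidelib code). The header contents other than sampleName are hypothetical.

```python
# Hypothetical header line as written by R's write.csv.
header = '"sampleName","barcode1"'

# Splitting the raw line keeps the quotes inside each column name,
# so a lookup for the bare name sampleName would fail.
raw_cols = header.strip().split(",")

# Removing every double quote first gives the bare column names.
clean_cols = header.strip().replace('"', '').split(",")
name_index = clean_cols.index("sampleName")
print(raw_cols, clean_cols, name_index)
```

One trade-off of this approach: because it strips quotes before splitting, a quoted field that itself contains a comma would still be split in two, which the csv module avoids.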

Updates in #53.