AI-sandbox / gnomix

A fast, scalable, and accurate local ancestry method.

Error while training #47

Open EfraMP opened 6 months ago

EfraMP commented 6 months ago

I have been testing gnomix in training mode, and got the following output:

...
--------------------------------------------------------------------------------
-----------------------------------  Gnomix  -----------------------------------
--------------------------------------------------------------------------------
When using this software, please cite: 
Helgi Hilmarsson, Arvind S Kumar, Richa Rastogi, Carlos D Bustamante, 
Daniel Mas Montserrat, Alexander G Ioannidis: 
"High Resolution Ancestry Deconvolution for Next Generation Genomic Data" 
https://www.biorxiv.org/content/10.1101/2021.09.19.460980v1
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Traceback (most recent call last):
  File "gnomix.py", line 358, in <module>
    config = yaml.load(file, Loader=yaml.UnsafeLoader)
  File "/cm/shared/apps/anaconda3/2021.05/envs/amarin37/lib/python3.7/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/cm/shared/apps/anaconda3/2021.05/envs/amarin37/lib/python3.7/site-packages/yaml/constructor.py", line 41, in get_single_data
    node = self.get_single_node()
  File "/cm/shared/apps/anaconda3/2021.05/envs/amarin37/lib/python3.7/site-packages/yaml/composer.py", line 35, in get_single_node
    if not self.check_event(StreamEndEvent):
  File "/cm/shared/apps/anaconda3/2021.05/envs/amarin37/lib/python3.7/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/cm/shared/apps/anaconda3/2021.05/envs/amarin37/lib/python3.7/site-packages/yaml/parser.py", line 143, in parse_implicit_document_start
    StreamEndToken):
  File "/cm/shared/apps/anaconda3/2021.05/envs/amarin37/lib/python3.7/site-packages/yaml/scanner.py", line 116, in check_token
    self.fetch_more_tokens()
  File "/cm/shared/apps/anaconda3/2021.05/envs/amarin37/lib/python3.7/site-packages/yaml/scanner.py", line 260, in fetch_more_tokens
    self.get_mark())
yaml.scanner.ScannerError: while scanning for the next token
found character '\t' that cannot start any token
  in "/path/training.smap", line 1, column 8

This seems really weird to me. The training.smap file does not contain a '\t' string, and it clearly doesn't have more than two columns.
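One thing worth checking: in the ScannerError, '\t' is how Python displays a single tab character, not the two-character string backslash-t, and "column 8" is a character position within line 1 rather than a field count. A tab is invisible in most editors, so a minimal sketch like the one below may help locate it. The function name `find_tabs` is hypothetical and not part of gnomix, and the sketch assumes the file is UTF-8 text.

```python
from pathlib import Path


def find_tabs(path):
    """Return 1-based (line, column) pairs for every tab character in a text file.

    Hypothetical helper for inspection only; it is not part of gnomix.
    """
    hits = []
    # Assumption: the file is small enough to read at once and is UTF-8 encoded.
    text = Path(path).read_text(encoding="utf-8")
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == "\t":
                hits.append((line_no, col_no))
    return hits
```

Calling `find_tabs("training.smap")` (with whatever path the file actually has) returns an empty list if there are no tabs. A first entry of `(1, 8)` would correspond to the position reported in the traceback.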

dralhindi commented 4 months ago

I've had this happen before and I think I worked around it by remaking the file using awk and making sure it's " " delimited. Also, not sure if it will help, but maybe check if there is an extra '\n' at the end of the file? My smap files only have 2 columns though (sampleID and the reference the sample is assigned to).
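For anyone who prefers Python to awk, the workaround described above could be sketched roughly as follows: rewrite the file so that fields are separated by a single space and blank lines (including extra ones at the end) are dropped. The name `normalize_smap` is hypothetical and not part of gnomix. The sketch assumes that sample IDs and reference labels contain no whitespace themselves, since it splits on any run of whitespace.

```python
def normalize_smap(src, dst):
    """Copy src to dst with single-space delimiters and no blank lines.

    Hypothetical helper, not part of gnomix. Returns the number of rows written.
    Assumption: no field contains whitespace, because str.split() with no
    arguments splits on any run of spaces or tabs.
    """
    rows = 0
    with open(src, encoding="utf-8") as fin, \
            open(dst, "w", encoding="utf-8", newline="\n") as fout:
        for line in fin:
            fields = line.split()
            if not fields:
                # Skip empty or whitespace-only lines, e.g. trailing newlines.
                continue
            fout.write(" ".join(fields) + "\n")
            rows += 1
    return rows
```

Writing to a separate output path keeps the original file intact, so the two can be compared before the new one is passed to gnomix.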