ExpressionAnalysis / STAR-SEQR

RNA Fusion Detection and Quantification

Error reading Chimeric.out.junction (pandas) #25

Open ruizgo opened 3 years ago

ruizgo commented 3 years ago

Dec 05 09:41:16 ..... loading genome
Dec 05 09:42:08 ..... started mapping
Dec 05 09:53:04 ..... finished mapping
Dec 05 09:53:04 ..... finished successfully

2020-12-05 09:53 - INFO - STAR Alignment Finished!
2020-12-05 09:53 - INFO - Importing junctions
2020-12-05 09:54 - ERROR - There was a problem reading your STAR *Chimeric.out.junction file
2020-12-05 09:54 - ERROR - Exception: could not convert string to float: NreadsUnique 33113327
Traceback (most recent call last):
  File "/usr/bin/starseqr.py", line 622, in <module>
    sys.exit(main())
  File "/usr/bin/starseqr.py", line 286, in main
    rawdf = su.core.import_starjxns(new_prefix + ".Chimeric.out.junction", args.keep_dups, args.keep_mito)
  File "/usr/lib64/python2.7/site-packages/starseqr_utils/core.py", line 55, in import_starjxns
    raise(ValueError, e, traceback)
  File "/usr/lib64/python2.7/site-packages/starseqr_utils/core.py", line 36, in import_starjxns
    df['pos1'] = df['pos1'].astype(float).astype(int) # this bypasses some strange numbers
  File "/usr/lib64/python2.7/site-packages/pandas/core/generic.py", line 5691, in astype
    **kwargs)
  File "/usr/lib64/python2.7/site-packages/pandas/core/internals/managers.py", line 531, in astype
    return self.apply('astype', dtype=dtype, **kwargs)
  File "/usr/lib64/python2.7/site-packages/pandas/core/internals/managers.py", line 395, in apply
    applied = getattr(b, f)(**kwargs)
  File "/usr/lib64/python2.7/site-packages/pandas/core/internals/blocks.py", line 534, in astype
    **kwargs)
  File "/usr/lib64/python2.7/site-packages/pandas/core/internals/blocks.py", line 633, in _astype
    values = astype_nansafe(values.ravel(), dtype, copy=True)
  File "/usr/lib64/python2.7/site-packages/pandas/core/dtypes/cast.py", line 702, in astype_nansafe
    return arr.astype(dtype, copy=True)
ValueError: could not convert string to float: NreadsUnique 33113327

kskuchin commented 3 years ago

I had the same issue, as well as an error about an unexpected number of columns. Both come from core.import_starjxns() reading the junctions file.

I just cloned the repo, opened starseqr_utils/core.py, and removed "header=None" from the pd.read_csv() call (line 31). That seemed to fix it.
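For anyone who would rather not rely on the header setting alone, here is a minimal sketch of a more defensive read. It is not the STAR-SEQR code: `read_junctions`, `COLS`, and the sample text are hypothetical, the sample is shortened to six columns, and its header names are illustrative. It assumes the file can contain a header row plus `#` summary lines, which is what the "NreadsUnique 33113327" value landing in pos1 suggests. Note that `comment="#"` would also truncate any data field containing `#`.

```python
import io

import pandas as pd

# Hypothetical column names for this sketch (real files have more columns).
COLS = ["chr1", "pos1", "str1", "chr2", "pos2", "str2"]

# Hypothetical, shortened stand-in for a Chimeric.out.junction file:
# a header row, two data rows, and a trailing '#' summary line.
SAMPLE = (
    "chr_donorA\tbrkpt_donorA\tstrand_donorA\t"
    "chr_acceptorB\tbrkpt_acceptorB\tstrand_acceptorB\n"
    "chr1\t1000\t+\tchr2\t2000\t-\n"
    "chr3\t3000\t-\tchr4\t4000\t+\n"
    "# Nreads 100\tNreadsUnique 90\tNreadsMulti 10\n"
)


def read_junctions(handle, names):
    """Read junction rows, ignoring '#' lines and any non-numeric header row."""
    # Read everything as text first so a stray header row cannot break parsing.
    df = pd.read_csv(handle, sep="\t", header=None, names=names,
                     comment="#", dtype=str)
    # Coerce positions to numbers; rows that fail (e.g. a header) become NaN.
    pos1 = pd.to_numeric(df["pos1"], errors="coerce")
    pos2 = pd.to_numeric(df["pos2"], errors="coerce")
    keep = pos1.notna() & pos2.notna()
    df = df[keep].copy()
    df["pos1"] = pos1[keep].astype(int)
    df["pos2"] = pos2[keep].astype(int)
    return df.reset_index(drop=True)


df = read_junctions(io.StringIO(SAMPLE), COLS)
print(df)
```

With a real file, the same function would be called with an open file handle or path and the full list of column names that core.py expects.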