dputhier / pygtftk

A Python package and a set of shell commands to handle GTF files
GNU General Public License v3.0

Problem when using chromosomes without 'chr' prefix #144

Closed dputhier closed 3 years ago

dputhier commented 3 years ago

xed types.Specify dtype option on import or set low_memory=False.
Traceback (most recent call last):
  File "/amuhome/puthier/.local/bin/gtftk", line 110, in <module>
    args = main()
  File "/amuhome/puthier/.local/bin/gtftk", line 94, in main
    CmdManager.run(args)
  File "/amuhome/puthier/.local/lib/python3.7/site-packages/pygtftk/cmd_manager.py", line 959, in run
    fun(**args)
  File "/amuhome/puthier/.local/lib/python3.7/site-packages/pygtftk/plugins/ologram.py", line 817, in ologram
    hits[feat_type] = overlap_partial(bedA=peak_file, bedsB=gtf_sub_bed, ft_type=feat_type)
  File "/amuhome/puthier/.local/lib/python3.7/site-packages/pygtftk/stats/intersect/overlap_stats_shuffling.py", line 383, in compute_overlap_stats
    Lrs_toappend, Lis_toappend, all_chrom2_toappend = read_bed.bed_to_lists_of_intervals(bedsB[k], chrom_len)
  File "pygtftk/stats/intersect/read_bed/read_bed_as_list.pyx", line 66, in pygtftk.stats.intersect.read_bed.read_bed_as_list.bed_to_lists_of_intervals
  File "<__array_function__ internals>", line 6, in unique
  File "/amuhome/puthier/.local/lib/python3.7/site-packages/numpy/lib/arraysetops.py", line 262, in unique
    ret = _unique1d(ar, return_index, return_inverse, return_counts)
  File "/amuhome/puthier/.local/lib/python3.7/site-packages/numpy/lib/arraysetops.py", line 323, in _unique1d
    ar.sort()
TypeError: '<' not supported between instances of 'str' and 'int'
 |-- 10:03-DEBUG_MEM-ologram : GTF deleted (#l=154726080, f=Homo_sapiens.GRCh37.67.gtf.gz).

dputhier commented 3 years ago

@qferre Hi Quentin. This problem occurs when chromosome names are in Ensembl format (i.e., 1, 2, 3 instead of chr1, chr2, chr3). I think the issue lies somewhere in the OLOGRAM code.
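For illustration, here is a minimal sketch of how this kind of error can arise and one possible way around it. It is not the actual OLOGRAM code or the fix that was applied: the toy BED content, the column names and the helper `read_bed_chrom_as_str` are hypothetical. The idea is that a chromosome column holding both integers (1, 2) and strings (X) cannot be sorted, which is what `ar.sort()` attempts in the traceback above, whereas forcing the column to `str` with the pandas `dtype` option keeps all names comparable.

```python
import io

import numpy as np
import pandas as pd

# Toy BED content with Ensembl-style chromosome names (hypothetical data).
BED_TEXT = "1\t100\t200\n2\t150\t300\nX\t10\t50\n"

# A column with mixed Python types, as can happen when numeric chromosome
# names are parsed as integers while names such as "X" stay strings.
mixed_chroms = np.array([1, 2, "X"], dtype=object)

try:
    # Sorting compares int with str and fails, like ar.sort() in the traceback.
    np.sort(mixed_chroms)
    mixed_failed = False
except TypeError:
    mixed_failed = True


def read_bed_chrom_as_str(text):
    """Hypothetical helper: read a 3-column BED, forcing chrom to str."""
    return pd.read_csv(
        io.StringIO(text),
        sep="\t",
        header=None,
        names=["chrom", "start", "end"],
        dtype={"chrom": str},  # assumption: chrom is the first column
    )


bed = read_bed_chrom_as_str(BED_TEXT)
# All chromosome names are now strings, so sorting and uniqueness work.
chroms = np.unique(bed["chrom"].to_numpy())
```

With the column read as strings, `1` and `chr1` style names are both handled the same way by any downstream sorting step.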

dputhier commented 3 years ago

@qferre Hi Quentin, I wanted to use OLOGRAM with my students but I still run into the same error. It seems that OLOGRAM doesn't handle Ensembl-format chromosome names (i.e., 1, 2, 3 instead of chr1, chr2, chr3); I got the same error as above. I think the fix should be small (could you push it to develop?).

dputhier commented 3 years ago

Fixed.