Magdoll / cDNA_Cupcake

Miscellaneous collection of Python and R scripts for processing Iso-Seq data
BSD 3-Clause Clear License

Error: Unrecognized ID format in get_abundance_post_collapse.py #160

Open Adamtaranto opened 3 years ago

Adamtaranto commented 3 years ago

Hi Liz,

I'm getting an error when I run get_abundance_post_collapse.py on a set of collapsed flnc isoforms.

The read names in unpolished.cluster_report.csv seem to match those in flnc.collapsed.group.txt (see examples below). I can't figure out why they are not being parsed correctly.

Any help would be much appreciated. Thanks.

> get_abundance_post_collapse.py flnc.collapsed unpolished.cluster_report.csv

Traceback (most recent call last):
  File "/.conda/envs/annotation/bin/get_abundance_post_collapse.py", line 278, in <module>
    get_abundance_post_collapse(args.collapse_prefix, args.cluster_report, args.collapse_prefix)
  File "/.conda/envs/annotation/bin/get_abundance_post_collapse.py", line 263, in get_abundance_post_collapse
    cid_info = read_group_filename(collapse_prefix + ".group.txt", is_cid=True)
  File "/.conda/envs/annotation/bin/get_abundance_post_collapse.py", line 117, in read_group_filename
    raise Exception("Unrecognized id format {0} in {1}!".format(cid, group_filename))
Exception: Unrecognized id format m64069_210422_153915 in flnc.collapsed.group.txt!

Example record from "flnc.collapsed.group.txt"

PB.1.1 m64069_210422_153915/126617003/ccs,m64069_210422_153915/112066612/ccs,m64069_210422_153915/96733715/ccs,m64069_210422_153915/92998104/ccs,m64069_210422_153915/87097780/ccs,m64069_210422_153915/63112841/ccs,m64069_210422_153915/122357892/ccs,m64069_210422_153915/141232097/ccs,m64069_210422_153915/166985963/ccs,m64069_210422_153915/67634755/ccs,m64069_210422_153915/169019127/ccs,m64069_210422_153915/174129561/ccs,m64069_210422_153915/105775521/ccs,m64069_210422_153915/10749171/ccs,m64069_210422_153915/108527833/ccs,m64069_210422_153915/109576597/ccs,m64069_210422_153915/110822944/ccs,m64069_210422_153915/12322813/ccs,m64069_210422_153915/126027585/ccs,m64069_210422_153915/127140487/ccs,m64069_210422_153915/135137015/ccs,m64069_210422_153915/136513370/ccs,m64069_210422_153915/137103866/ccs,m64069_210422_153915/138939120/ccs,m64069_210422_153915/14354283/ccs,m64069_210422_153915/153093701/ccs,m64069_210422_153915/154272731/ccs,m64069_210422_153915/156568168/ccs,m64069_210422_153915/158665503/ccs,m64069_210422_153915/170199322/ccs,m64069_210422_153915/170263994/ccs,m64069_210422_153915/171115723/ccs,m64069_210422_153915/177406510/ccs,m64069_210422_153915/23790628/ccs,m64069_210422_153915/25232106/ccs,m64069_210422_153915/25692143/ccs,m64069_210422_153915/29951839/ccs,m64069_210422_153915/35455420/ccs,m64069_210422_153915/3868653/ccs,m64069_210422_153915/40110844/ccs,m64069_210422_153915/43386031/ccs,m64069_210422_153915/51513536/ccs,m64069_210422_153915/65863865/ccs,m64069_210422_153915/66192895/ccs,m64069_210422_153915/69271999/ccs,m64069_210422_153915/73334886/ccs,m64069_210422_153915/76873883/ccs,m64069_210422_153915/86116292/ccs,m64069_210422_153915/90440196/ccs,m64069_210422_153915/90506655/ccs,m64069_210422_153915/96471166/ccs,m64069_210422_153915/98370473/ccs,m64069_210422_153915/122028220/ccs,m64069_210422_153915/16451289/ccs,m64069_210422_153915/164758043/ccs,m64069_210422_153915/9306714/ccs,m64069_210422_153915/33686786/ccs,m64069_210422_153915/146145536/ccs,m64069_210422_153915/99811548/ccs,m64069_210422_153915/179503162/ccs,m64069_210422_153915/29163649/ccs,m64069_210422_153915/9832728/ccs,m64069_210422_153915/77859643/ccs,m64069_210422_153915/99027360/ccs,m64069_210422_153915/133104384/ccs,m64069_210422_153915/102435628/ccs,m64069_210422_153915/103877404/ccs,m64069_210422_153915/172294956/ccs,m64069_210422_153915/61147007/ccs,m64069_210422_153915/164037296/ccs,m64069_210422_153915/133827120/ccs,m64069_210422_153915/94308285/ccs,m64069_210422_153915/121242734/ccs,m64069_210422_153915/150077837/ccs,m64069_210422_153915/8848219/ccs,m64069_210422_153915/46990956/ccs,m64069_210422_153915/138806903/ccs,m64069_210422_153915/3015791/ccs,m64069_210422_153915/44761405/ccs,m64069_210422_153915/105972474/ccs,m64069_210422_153915/74254030/ccs,m64069_210422_153915/137955093/ccs,m64069_210422_153915/180290687/ccs,m64069_210422_153915/155780577/ccs,m64069_210422_153915/136446049/ccs,m64069_210422_153915/143654922/ccs,m64069_210422_153915/161022461/ccs,m64069_210422_153915/27001801/ccs,m64069_210422_153915/21495811/ccs,m64069_210422_153915/174982241/ccs,m64069_210422_153915/175505963/ccs

Example records from "unpolished.cluster_report.csv"

cluster_id,read_id,read_type
transcript/0,m64069_210422_153915/123339638/ccs,FL
transcript/0,m64069_210422_153915/134678389/ccs,FL
transcript/1,m64069_210422_153915/171705954/ccs,FL
transcript/1,m64069_210422_153915/150667514/ccs,FL
transcript/1,m64069_210422_153915/63638362/ccs,FL
transcript/2,m64069_210422_153915/76285375/ccs,FL
transcript/2,m64069_210422_153915/84675058/ccs,FL
transcript/3,m64069_210422_153915/122355744/ccs,FL
transcript/3,m64069_210422_153915/12256305/ccs,FL

Magdoll commented 3 years ago

Hi @Adamtaranto, what version of Cupcake are you using? Can you upgrade to the latest one to confirm it's not an old-version issue?

Adamtaranto commented 3 years ago

I was using 22.0.0, which is the most recent version on bioconda. I'll try the master branch version.
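
Independently of the Cupcake version, it may help to check which kind of ID each file actually carries. In the excerpts above, the members listed in flnc.collapsed.group.txt look like CCS read names (movie/ZMW/ccs), while the cluster_id column of unpolished.cluster_report.csv holds transcript/N values. The following is a minimal diagnostic sketch, not part of cDNA_Cupcake: the function names and the two regular expressions are assumptions based only on the examples in this thread, and whether this difference is what triggers the exception is not confirmed here.

```python
import re
from collections import Counter

# Hypothetical patterns, inferred only from the examples in this thread.
CLUSTER_ID = re.compile(r"^transcript/\d+$")   # e.g. transcript/0
CCS_READ_ID = re.compile(r"^[^/]+/\d+/ccs$")   # e.g. m64069_210422_153915/123339638/ccs


def id_kind(identifier):
    """Label an ID as 'cluster', 'read', or 'other' by its shape."""
    if CLUSTER_ID.match(identifier):
        return "cluster"
    if CCS_READ_ID.match(identifier):
        return "read"
    return "other"


def summarize_group_members(group_lines):
    """Count ID kinds among the comma-separated members of each group.txt row."""
    counts = Counter()
    for line in group_lines:
        fields = line.split(None, 1)  # "<PB id><whitespace><members>"
        if len(fields) != 2:
            continue  # skip blank or malformed rows
        for member in fields[1].strip().split(","):
            counts[id_kind(member)] += 1
    return counts


def summarize_report_column(report_lines, column):
    """Count ID kinds in one named column of the cluster report CSV."""
    rows = [line.strip().split(",") for line in report_lines if line.strip()]
    index = rows[0].index(column)  # first row is the header
    return Counter(id_kind(row[index]) for row in rows[1:])


# Small excerpts copied from the records shown in this issue.
demo_group = [
    "PB.1.1 m64069_210422_153915/126617003/ccs,m64069_210422_153915/112066612/ccs\n",
]
demo_report = [
    "cluster_id,read_id,read_type\n",
    "transcript/0,m64069_210422_153915/123339638/ccs,FL\n",
    "transcript/0,m64069_210422_153915/134678389/ccs,FL\n",
]

print(summarize_group_members(demo_group))
print(summarize_report_column(demo_report, "cluster_id"))
print(summarize_report_column(demo_report, "read_id"))
```

On the real files, the same functions can be called with open file handles, for example `summarize_group_members(open("flnc.collapsed.group.txt"))`. If the group file members fall under 'read' while the report's cluster_id column falls under 'cluster', the two files are keyed on different ID types, which would be worth mentioning when comparing against the master branch.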