getzlab / rnaseqc

Fast, efficient RNA-Seq metrics for quality control and process optimization

Collapse.py file #47

Closed jjyotikataria closed 2 years ago

jjyotikataria commented 4 years ago

@agraubert @joshua-gould @francois-a @dmcgoldrick Hi, could you please advise on how to convert a full-length annotation GTF into the input GTF for rnaseqc? I have been using schmidtea_mediterranea.PRJNA379262.WBPS14.canonical_geneset.rnaseqc.gtf, downloaded from here:

ftp://ftp.ebi.ac.uk/pub/databases/wormbase/parasite/releases/WBPS14/species/schmidtea_mediterranea/PRJNA379262/schmidtea_mediterranea.PRJNA379262.WBPS14.canonical_geneset.gtf.gz

I ran collapse_annotation.py, which produced this error:

python collapse_annotation.py schmidtea_mediterranea.PRJNA379262.WBPS14.canonical_geneset.rnaseqc_type.gtf new.gtf
  File "collapse_annotation.py", line 100
    print('Parsing GTF: {0:d} genes processed\r'.format(len(self.genes)), end='\r')
                                                                             ^
SyntaxError: invalid syntax

And when I run the rnaseqc command, I get an error about duplicate exon IDs:

Failed to parse the GTF: Detected non-unique Exon ID: SMEST026639001.e4
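One way to see how widespread the duplicate exon IDs are before running rnaseqc is to count `exon_id` values across the exon records of the GTF. The following is a minimal sketch, not part of rnaseqc: it assumes the standard 9-column, tab-separated GTF layout and that the ID in the error message corresponds to the `exon_id "..."` attribute. The function name `find_duplicate_exon_ids` is hypothetical.

```python
import re
from collections import Counter

# Assumption: exon IDs are stored as exon_id "VALUE" in column 9.
EXON_ID_RE = re.compile(r'exon_id "([^"]+)"')

def find_duplicate_exon_ids(lines):
    """Return {exon_id: count} for exon_id values seen on more than one exon record."""
    counts = Counter()
    for line in lines:
        if line.startswith('#'):
            continue  # skip header/comment lines
        fields = line.rstrip('\n').split('\t')
        if len(fields) < 9 or fields[2] != 'exon':
            continue  # only exon records are counted
        match = EXON_ID_RE.search(fields[8])
        if match:
            counts[match.group(1)] += 1
    return {exon_id: n for exon_id, n in counts.items() if n > 1}
```

It can be applied to an open file handle, for example `with open('annotation.gtf') as fh: print(find_duplicate_exon_ids(fh))`. Repeated IDs can arise, for example, when one exon is listed under several transcripts.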

agraubert commented 3 years ago

Hi @jjyotikataria, sorry for the delayed response. At first glance, it looks like you're using Python 2: the print(..., end='\r') call on line 100 is Python 3 syntax. Could you try running the collapse_annotation.py script with Python 3.5 or later?
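To confirm which interpreter the `python` command actually launches, a small standalone check can help. This is only a sketch: the helper name `meets_minimum` is hypothetical, and the 3.5 minimum is taken from the suggestion above.

```python
import sys

def meets_minimum(version_info, minimum=(3, 5)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum

if not meets_minimum(sys.version_info):
    # Uses only syntax that is valid in both Python 2 and Python 3.
    sys.exit("Python 3.5+ required, found {}.{}".format(*sys.version_info[:2]))
```

Such a check has to live in its own file (or be run before the script). A SyntaxError is raised when a file is compiled, before any of its code executes, so a guard placed inside the failing script would never run.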

Anto007 commented 3 years ago

Hi, related to this issue: I ran collapse_annotation.py with Python 3.6.2 and got the error below. Any help would be appreciated. My end goal is to get rnaseqc working; it currently core-dumps with my existing GTF file (downloaded from NCBI).

python3 collapse_annotation.py GCF_003668045.3_CriGri-PICRH-1.0_genomic.gtf GCF_003668045.3_CriGri-PICRH-1.0_genomic_collapsed.gtf

Traceback (most recent call last):
  File "collapse_annotation.py", line 279, in <module>
    annotation = Annotation(args.transcript_gtf)
  File "collapse_annotation.py", line 67, in __init__
    attributes[kv[0]] = kv[1]
IndexError: list index out of range
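The IndexError on `attributes[kv[0]] = kv[1]` suggests that at least one attribute entry in column 9 did not split into a key and a value. The script's actual parsing code is not shown in this thread, so the following is only a sketch of a tolerant parser that could be used to locate such entries. The function name `parse_gtf_attributes` and the splitting strategy are assumptions, not the script's implementation.

```python
def parse_gtf_attributes(attr_field):
    """Parse GTF column 9 into a dict; also return entries that lack a value."""
    attributes = {}
    malformed = []
    for entry in attr_field.strip().split(';'):
        entry = entry.strip()
        if not entry:
            continue  # empty piece, e.g. after the trailing ';'
        kv = entry.split(None, 1)  # split key from value on the first whitespace run
        if len(kv) != 2:
            malformed.append(entry)  # key without a value (or a stray fragment)
            continue
        attributes[kv[0]] = kv[1].strip('"')
    return attributes, malformed
```

Running this over each line's ninth field and printing any non-empty `malformed` list should point at the offending records. One limitation of this simple approach: a quoted value that itself contains `;` is split in the wrong place, so its fragments may be reported as malformed or misparsed.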

jjyotikataria commented 3 years ago

Hi, if that works for you, we could look into it together on Google Meet or Skype?

jjyotikataria commented 3 years ago

It worked for me.

Anto007 commented 3 years ago

Thank you for your quick response, Jyoti, much appreciated! When you say it worked for you, do you mean that you ran collapse_annotation.py on the GTF file downloaded from https://www.ncbi.nlm.nih.gov/assembly/GCF_003668045.3/? This is the exact GTF file that I'm interested in collapsing. Thanks again!

jjyotikataria commented 3 years ago

I'll check the link tonight; I'm away from my laptop at the moment. As far as I remember, though, the GTF needs to meet one more criterion before it can be collapsed. I'll share that link with you as well.
