Closed Maheen94 closed 2 years ago
Hi, it seems the file is not accessible. Can you go to the working folder indicated and check whether the link to the annotation file is working?
Luca
Hi,
So, I downloaded the reference genome (FASTA) and the corresponding annotation file (GTF) from Gencode using the following commands in the anno folder:
wget http://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_38/GRCh38.p13.genome.fa.gz
wget http://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_38/gencode.v38.chr_patch_hapl_scaff.annotation.gtf.gz
The alignment works fine, but it's specifically the counts function that throws an error. I also tried other annotation files from Gencode, but the same problem persisted.
Maheen
Mmm... Can you try unzipping the GTF file first? I'll double-check if this is a bug. Thanks for letting me know!
Sure, will try unzipping the file first. Will keep you updated.
Tried rerunning it with the unzipped file but ran into the same error:
Also, it says "Error occurred when processing GFF file". I always use GTF format and haven't run into issues before. I'm wondering if I should use GFF3 format instead?
Command exit status: 1
Command output: (empty)
Command error:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
Error occured when processing GFF file (line 1 of file /home/ubuntu/environment/master_of_pores/NanoPreprocess/../anno/gencode.v38.chr_patch_hapl_scaff.annotation.gtf):
[Errno 2] No such file or directory: '/home/ubuntu/environment/master_of_pores/NanoPreprocess/../anno/gencode.v38.chr_patch_haplscaff.annotation.gtf'
[Exception type: FileNotFoundError, raised in init.py:47]
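The exception reported above is a FileNotFoundError ([Errno 2]), which points at the path rather than at the GTF/GFF3 format. As a minimal sketch for separating the two cases, the Python below first checks that the path resolves and then checks that the first record has the nine tab-separated columns GTF uses. The function names are hypothetical and are not part of master_of_pores or HTSeq.

```python
import gzip
import os


def first_gtf_record(path):
    """Return the first non-comment line of a GTF file, split into columns.

    Raises FileNotFoundError if the path does not resolve (this also
    covers a dangling symlink, because os.path.exists follows links).
    """
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    # Handle both the gzipped and the unzipped annotation file.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip header comments and blank lines
            return line.rstrip("\n").split("\t")
    return None


def looks_like_gtf(path):
    """True if the first record has the nine tab-separated GTF columns."""
    record = first_gtf_record(path)
    return record is not None and len(record) == 9
```

If `looks_like_gtf` succeeds on the host but the pipeline still reports "No such file or directory", the file itself is probably fine and the problem is more likely in how the path is seen from inside the container.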
Well, no. That process uses htseq-count, which can read GTF files, so I don't understand why you get this error. But again I see a "file not found". Can you go to that temporary folder and check whether the link is OK?
The link is actually OK. I just double-checked it.
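Nextflow by default stages input files into the task's work directory as symbolic links, so "checking the link" means checking that those links still resolve. A minimal sketch of that check in Python follows; the function name is hypothetical, and the directory to pass in is the task work folder printed in the Nextflow error report.

```python
import os


def broken_links(task_dir):
    """List symlinks directly inside task_dir whose targets do not exist.

    Returns (link name, link target) pairs, sorted by link name.
    """
    broken = []
    for name in sorted(os.listdir(task_dir)):
        path = os.path.join(task_dir, name)
        # islink is True even for a dangling link; exists follows the
        # link, so it is False when the target is missing.
        if os.path.islink(path) and not os.path.exists(path):
            broken.append((name, os.readlink(path)))
    return broken
```

An empty list only shows that the links resolve on the host. A link can still dangle inside a container if the directory holding its target is not mounted there.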
OK. You can run that command directly with:
singularity exec -e NAMEOFIMAGE .command.sh
The name of the image can be retrieved with:
grep singu .command.run
Command 'singularity' not found, but can be installed with:
I'm using Docker. I normally start my pipeline with the following command:
nextflow run nanopreprocess.nf -with-docker
Aha. Well, maybe there is a problem with the mounting of volumes in Docker. If you grep docker inside .command.run, you will find the command to use.
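The grep step suggested above can also be sketched in Python, which makes it easy to inspect the mounted volumes afterwards. The function name is hypothetical, and the exact content of .command.run is not shown in this thread, so treat the usage below as an assumption about what the file contains.

```python
def container_lines(command_run_path, keyword="docker"):
    """Return the stripped lines of a .command.run file that mention keyword."""
    with open(command_run_path, "r") as handle:
        return [line.strip() for line in handle if keyword in line]
```

Calling `container_lines(".command.run")` from the task work folder should print the container invocation. If that line lists mounted volumes, it is worth checking that the directory holding the annotation file (here the anno folder) is among them.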
Hi, so I changed one parameter in params.config.
Previous: ref_type "genome" (in theory this makes sense, since I'm using a genome reference)
Changed: ref_type "transcriptome" (it ran smoothly with this setting, even though I used the same reference genome downloaded into the anno folder with the wget command)
Changing to transcriptome will change the tool used for counting. So something weird is happening with the htseq-count tool. Did you solve it?
I have the same error too. I tried it with both the gz and the gunzipped format. The command works well when I run it using the system's htseq but fails when using the singularity container.
The transcriptome mapping works
Ouch. Can you send me a single fast5 file so I can try it?
My email is luca.cozzuto /at/ crg . eu
Thanks for the super quick response!
I tried it with the test data provided as well and get the same error.
Best, Ashkan
Hi, the test dataset is transcriptome, so it won't use htseq-count (which needs the GTF annotation). If you have a single fast5 file and tell me which genome and annotation you are using, I'll give it a try and debug it.
Hi, for me the same error persisted. Only the transcriptome mapping works.
I sent you a mail
Hi, I replied to the mail. I was unable to reproduce the error. I have Singularity version 3.2.1 and I'm using these params:
params {
kit = "SQK-RNA001"
flowcell = "FLO-MIN106"
fast5 = "$baseDir/../data/input/*.fast5"
reference = "$baseDir/../anno/GRCh38.primary_assembly.genome.fa.gz"
annotation = "$baseDir/../anno/gencode.v38.annotation.gtf"
ref_type = "genome"
seq_type = "RNA"
output = "$baseDir/output"
qualityqc = 5
granularity = ""
basecaller = "guppy"
basecaller_opt = ""
GPU = "OFF"
demultiplexing = ""
demultiplexing_opt = ""
demulti_fast5 = "OFF"
filter = ""
filter_opt = ""
mapper = "minimap2"
mapper_opt = ""
map_type = "spliced"
counter = "YES"
counter_opt = ""
variant_caller = "NO"
variant_opt = ""
downsampling = ""
email = ""
}
Hi, it looks like htseq has a problem with some long-read mappers. We fixed this in the version of master of pores:
Hi, I'm using master of pores for RNA seq data analysis. I used reference and annotation files from Gencode. However, I get the following error consistently at the counts step. Would really appreciate any assistance. Thanks!!
Maheen