Closed ksahlin closed 4 years ago
oh, and this is the command line:
graphmap2 align --index index.gmidx --gtf NA12878.gtf --threads 62 -x rnaseq -r Homo_sapiens.GRCh38.dna.primary_assembly.fa -d ont_human.fq -o reads.sam
some of the log:
[09:08:03 BuildIndexes] Loading GTF annotations.
[09:08:51 BuildIndexes] Loading genomic sequences.
[09:09:22 BuildIndexes] Generating the transcriptome.
[09:09:22 GenerateTranscriptomeSeqs] Constructing the transcriptome sequences.
[09:09:22 GenerateTranscriptomeSeqs] In total, there are 67 transcripts.
[09:09:22 SetupIndex_] Loading index from file: '/nfs/brubeck.bx.psu.edu/scratch4/ksahlin/ultra_eval/alignments/graphmap2/ont_human/index.gmidx'.
[09:09:22 Index] Memory consumption: [currentRSS = 166 MB, peakRSS = 3352 MB]
[09:09:22 Run] Hits will be thresholded at the percentil value (percentil: 99.000000%, frequency: 6).
[09:09:22 Run] Minimizers will be used. Minimizer window length: 5
[09:09:22 Run] Reference genome is assumed to be linear.
[09:09:22 Run] One or more similarly good alignments will be output per mapped read. Will be marked secondary.
[09:09:23 ProcessReads] All reads will be loaded in memory.
[09:09:36 ProcessReads] All reads loaded in 11.59 sec (size around 1490 MB). (722592014 bases)
[09:09:36 ProcessReads] Memory consumption: [currentRSS = 1954 MB, peakRSS = 3352 MB]
...here is the progressbar and the error happens...
Command terminated by signal 6
@ksahlin Hi, could you tell me where can I download the reference and gtf file you are using. I can see it is the 38 human chromosome, but I would like to use the same files to run it on my side.
Of course,
the genome is from ensemble release 98 from ftp://ftp.ensembl.org/pub/release-98/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
The GTF annotation is ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/gencode.v32.chr_patch_hapl_scaff.annotation.gtf.gz
Thanks!
To further troubleshoot:
I get the same error when running without the GTF file as
graphmap2 align --index index.gmidx --threads 32 -x rnaseq -r Homo_sapiens.GRCh38.dna.primary_assembly.fa -d ont_human.fq -o reads.sam
I may also add that I'm running graphmap2 through a snakemake pipeline, which executes 2 jobs of graphmap2 at the same time (on two separate nodes though). Are there any temporary files created in, e.g., the execution directory that may interfere with each other?
Best, Kristoffer
I have probably caused this error because the chromosome labels in my GTF didn't match the labels in the fasta (chr1 --> 1,..., MT --> chrM).
I fixed this and it seems to be working fine. Maybe a small request would be a more informative error message, but it was ultimately an error caused by the user :) I will get back to you if I run into any other problems.
Ok. I was able to see where the error was, but I didn't know what caused it. I will put more clear message at that place. We will release a new version with more precise alignments in a few days.
Hi Ivan,
I get the following error when running graphmap2 on both my ONT cDNA datasets (one dataset can be found here (from this webpage: https://github.com/nanopore-wgs-consortium/NA12878/blob/master/nanopore-human-transcriptome/fastq_fast5_bulk.md)
Any ideas what this might be?
Best, Kristoffer