ParkinsonLab / Metatranscriptome-Workshop

Metatranscriptomics Tutorial

MemoryError #10

Closed kaiwang12 closed 5 years ago

kaiwang12 commented 5 years ago

Hello, I got a MemoryError when running 6_BWA_Gene_Map.py.

I would appreciate any help.

Kai

billytaj commented 5 years ago

without an error log or anything to go by, i can't help too much. It's likely your computer doesn't have enough RAM to load your dataset or one of the libraries.

kaiwang12 commented 5 years ago

Hi, thank you for your quick reply. My server has 128 GB of RAM.

Here is the error log:

Traceback (most recent call last):
  File "./6_BWA_Gene_Map.py", line 117, in <module>
    gene_map(BWA_sam_file, unmapped_reads)
  File "./6_BWA_Gene_Map.py", line 79, in gene_map
    gene2read_map[db_match].append(read)
MemoryError

Kai

billytaj commented 5 years ago

did all the other steps work without error?

kaiwang12 commented 5 years ago

All the other steps worked without error.

Kai

billytaj commented 5 years ago

then i have no idea. MemoryError in Python comes down to one of 2 things: either you have no more free memory on your machine, or one of the data input files is wrong.

can you absolutely guarantee no errors whatsoever happened along the way at every step? (show all the output logs of every step)

can you guarantee that your server isn't somehow leaking memory, or that its memory isn't being occupied by someone else at runtime? (ps au)

how big is your dataset? are there funny artifacts in your data?
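
One way to answer the last two questions is to summarize each FASTA input before running the step. The following is a minimal sketch, assuming plain-text FASTA input; `summarize_fasta` is a hypothetical helper and not part of the workshop scripts, and headers with embedded whitespace are only one example of an artifact worth flagging.

```python
import io


def summarize_fasta(handle):
    """Return (record_count, headers_with_whitespace) for a FASTA stream.

    Hypothetical helper for illustration; not part of the workshop scripts.
    """
    records = 0
    spaced = 0
    for line in handle:
        if line.startswith(">"):
            records += 1
            # More than one whitespace-delimited token means the header
            # carries a description after the sequence ID.
            if len(line.split()) > 1:
                spaced += 1
    return records, spaced


# Small in-memory demo; for a real file, pass open("some_file.fasta") instead.
demo = io.StringIO(">contig1 extra description\nACGT\n>contig2\nGGCC\n")
print(summarize_fasta(demo))
```

A large count in the second position suggests the headers may not match what downstream scripts expect as sequence IDs.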

kaiwang12 commented 5 years ago

Actually, I'm using your test dataset (mouse1.fastq).

1) I can guarantee that I'm the only user of my server.

2) Here is the command I ran:

./6_BWA_Gene_Map.py microbial_all_cds.fasta mouse1_contigs_map.tsv mouse1_genes_map.tsv mouse1_genes.fasta mouse1_contigs.fasta mouse1_contigs_annotation_bwa.sam mouse1_contigs_unmapped.fastq mouse1_unassembled.fastq mouse1_unassembled_annotation_bwa.sam mouse1_unassembled_unmapped.fasta

3) I got no errors when running the previous steps, but I'll double-check the output logs.

Thanks.

kaiwang12 commented 5 years ago

Hey,

Thanks for your reply. I think I figured out why I got MemoryError.

I used Trinity instead of SPAdes to assemble the reads. The description line of each Trinity contig has three columns separated by spaces.

I used the following command to reformat the description lines, and then everything worked fine:

awk '{ if (NR%2==1) { print $1 } else { print } }' mouse1_contigs_trinity.fasta > mouse1_contigs.fasta
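
The awk one-liner treats every odd-numbered line as a header, so it relies on each sequence occupying exactly one line. A minimal Python sketch of the same cleanup that keys on the leading ">" instead, and so also handles sequences wrapped over several lines, might look like this. `strip_fasta_descriptions` is a hypothetical name, and the header in the demo is made up for illustration rather than copied from real Trinity output.

```python
import io


def strip_fasta_descriptions(src, dst):
    """Copy FASTA records from src to dst, keeping only the first
    whitespace-delimited token of each header line.

    Hypothetical helper for illustration; not part of the workshop scripts.
    """
    for line in src:
        if line.startswith(">"):
            # ">contig_1 len=8 path=[...]" becomes ">contig_1"
            dst.write(line.split()[0] + "\n")
        else:
            # Sequence lines are copied through unchanged.
            dst.write(line)


# In-memory demo with an illustrative three-column header.
src = io.StringIO(">contig_1 len=8 path=[0:0-7]\nACGT\nTTGA\n>contig_2\nGG\n")
dst = io.StringIO()
strip_fasta_descriptions(src, dst)
print(dst.getvalue(), end="")
```

For real files, the two streams would be `open("mouse1_contigs_trinity.fasta")` and `open("mouse1_contigs.fasta", "w")`.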

Thanks.

Kai

billytaj commented 5 years ago

that would mean you didn't follow the instructions as they were written.

rule of thumb with pipelines: they're extremely inflexible, and are really only meant for the tools they were made for. we can't possibly accommodate every tool, version, or flavour of every bioinformatics analysis tool out there.