mhuttner / miRA

GNU General Public License v2.0

*** Error in `.': double free or corruption (!prev): 0x0000000003f720a0 *** #1

Open pranalis018 opened 8 years ago

pranalis018 commented 8 years ago

Hi all, I have been trying to use miRA for miRNA analysis and to find novel miRNAs in Ipomoea batatas. The command I am using is:

./miRA full -c configuration.config bam_files/SpR_Ctrl.sam Ref_genome.fa spr-ctrl/

I have attached my config file for reference.

# Log Level
#0 = quiet
#1 = normal
#2 = verbose
log_level = 1

# Number of processor threads used for folding structures.
# Note: This might severely affect computations of other
# users on the same machine.
openmp_thread_count = 10

# During determination of expression clusters from the
# alignment data, two clusters are merged if they lie less 
# than cluster_gap_size (in nt) apart.
cluster_gap_size = 10

# Discard a cluster if it contains less than 
# cluster_min_reads reads.
cluster_min_reads = 10

# Length of flanking region (in nt) by which the main 
# expression cluster is extended at the 5' and 3' end.
cluster_flank_size = 200

# Maximum length (in nt) of a cluster (including flanking
# regions).
cluster_max_length = 2000

# Minimum length (in nt) of precursor.
# Ignored if min_precursor_length = 0.
min_precursor_length= 20

# Maximum length (in nt) of precursor.
# Ignored if max_precursor_length = 0.
max_precursor_length= 0

# Per-nucleotide minimum free energy (MFE/nt) of the 
# folded sequence. miRNA candidates must have 
# MFE/nt < max_mfe_per_nt.
max_mfe_per_nt = -0.2

# Maximum number of hairpins of the folded structure.
# miRNA candidates must have a number of 
# hairpins < max_hairpin_count.
max_hairpin_count = 4

# Minimum length (in nt) of the double stranded segment 
# within the folded sequence. miRNA candidates must have 
# a double-stranded segment (allowing for one mismatch) 
# of length >= min_double_strand_length.
min_double_strand_length = 18

# Number of permutations for the calculation of the
# null distribution.
permutation_count = 100

# p-value cutoff for significance testing.
# Optimum structures must have a p-value smaller (<) 
# than max_pvalue.
max_pvalue = 0.01

# Threshold for Dicer-associated difference in coverage
# as fraction of total miRNA precursor coverage.
min_coverage = 0.01

# Minimum fraction of paired nucleotides in mature/star
# miRNA duplex
min_paired_fraction = 0.55

# Minimum length (inclusive, in nt) of mature/star miRNA.
min_duplex_length = 18

# Maximum length (exclusive, in nt) of mature/star miRNA.
max_duplex_length = 30

# Allow/disallow for 3 consecutive mismatches in the 
# mature miRNA. 
# Allow = 1.
allow_three_mismatches = 1

# Allow/disallow for 2 consecutive mismatches at the
# start/end of the miRNA duplex. 
# Allow = 1.
allow_two_terminal_mismatches= 1

# Setting to 1 creates gnuplot coverage plots.
# Note: Requires gnuplot.
create_coverage_plots = 0

# Setting to 1 creates structure plots.
# Note: Requires VARNA.
create_structure_plots = 1

# Setting to 1 creates structure coverage plots.
# Note: Requires VARNA.
create_structure_coverage_plots = 1

# Setting to 1 deletes auxiliary files.
# Auxiliary files including LaTeX intermediate files,
# eps figures, etc.
cleanup_auxiliary_files = 1
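
One way to rule out the config file is to parse it and check the few constraints its own comments state. The following is a minimal Python sketch, not part of miRA: `parse_config` and `check_config` are hypothetical helper names, the `key = value` syntax with `#` comments is assumed from the file above, and the checks are only my reading of the comments (the duplex length range must be non-empty, a precursor maximum of 0 means "ignored", and p-values and fractions lie between 0 and 1).

```python
def parse_config(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a 'key = value' line: {raw!r}")
        # Every value in the config shown above is numeric.
        settings[key.strip()] = float(value.strip())
    return settings


def check_config(settings):
    """Return a list of problems found (empty if none). Checks are assumptions."""
    problems = []
    min_duplex = settings.get("min_duplex_length")
    max_duplex = settings.get("max_duplex_length")
    if min_duplex is not None and max_duplex is not None and min_duplex >= max_duplex:
        problems.append("min_duplex_length must be below max_duplex_length")
    min_pre = settings.get("min_precursor_length", 0)
    max_pre = settings.get("max_precursor_length", 0)
    if max_pre != 0 and min_pre > max_pre:  # 0 means the limit is ignored
        problems.append("min_precursor_length exceeds max_precursor_length")
    for key in ("max_pvalue", "min_coverage", "min_paired_fraction"):
        if key in settings and not 0 <= settings[key] <= 1:
            problems.append(f"{key} should lie between 0 and 1")
    return problems
```

Calling `check_config(parse_config(open("configuration.config").read()))` on the file above should return an empty list, which is consistent with the config not being the cause.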

But I keep getting the error: *** Error in `.': double free or corruption (!prev): 0x0000000003f720a0 ***

Is it my file, my command, or something else that I am doing wrong? Do I need to install anything else to get rid of this issue?

mevers commented 8 years ago

Hi.

The config file looks fine. If you've followed the instructions on the GitHub page, you shouldn't need to install anything else. Can you check that the miRA example runs cleanly on your computer?

Maurits

PS. I've edited your original comment to prevent markdown from formatting the config file entries.

pranalis018 commented 8 years ago

Hi, thanks for your quick response. Yes, I ran the sample file on my machine. It runs completely fine :) and does not show any such issue.

mevers commented 8 years ago

Ok. There might be an issue with the file formatting of the SAM and/or FASTA files that you are using, which throws off miRA. Let's try and figure this out.

During which stage does the error occur? Does miRA generate contig clusters (stage 1)? Does miRA fail during the folding process (stage 2)? Or does the error occur when validating folded candidates?

It is important that chromosome names from the SAM file match those from the FASTA file. This should be the case if you used the same FASTA file during the alignment process. Which program did you use for the read alignment? The resulting SAM alignment file should not contain unmapped reads. Can you double-check that this is the case? You can remove unmapped reads from your SAM file with

samtools view -hS -F 4 all_reads.sam | samtools sort - > sorted_mapped_reads
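
Both checks (no unmapped reads, matching chromosome names) can also be done without samtools. Below is a rough Python sketch, not part of miRA, with hypothetical helper names `sam_summary` and `fasta_names`. It relies on the SAM layout (tab-separated fields, FLAG in column 2, RNAME in column 3, header lines starting with `@`, FLAG bit 0x4 meaning "unmapped") and assumes the aligner used the first whitespace-delimited token of each FASTA header as the reference name, which is common but not guaranteed.

```python
def sam_summary(sam_lines):
    """Count unmapped reads and collect reference names from SAM text lines."""
    unmapped = 0
    names = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines such as @HD / @SQ
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4:  # FLAG bit 0x4: read is unmapped
            unmapped += 1
        elif rname != "*":
            names.add(rname)
    return unmapped, names


def fasta_names(fasta_lines):
    """Collect sequence names: first token after '>' (an assumption, see above)."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])
    return names
```

Passing open file handles for the SAM and FASTA files works, since both functions only iterate over lines. Any name in `names - fasta_names(...)` is a reference that appears in the alignments but not in the genome file.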

Maurits

pranalis018 commented 8 years ago

Yes :) I have checked all the SAM files and they contain only mapped reads. Everything works correctly up to report generation. I am working with 8 SAM files here. For some files it produces the structure.ps folder but no coverage plots, and for 3 files it fails right after generating the fold_candidates file.

mevers commented 8 years ago

So let's make sure that we're on the same page:

  1. miRA finishes stage 1 (generating cluster contigs) without errors.
  2. miRA finishes stage 2 (folding) without errors. Is that correct?

I don't quite understand what you mean by "working with 8 SAM files". Have you merged the SAM files and run miRA on the resulting merged SAM? Or have you been running miRA separately on the 8 files? For miRA identification I would suggest merging the SAM files, and running miRA on the single merged SAM file. You can then later go and quantify changes in expression from identified miRNA candidates.

One more thing: can you check whether the error always occurs for the same candidate? If so, it would suggest a problem with a particular entry in the list of candidates. If not, it might point towards a memory problem.
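
To compare runs, one option is to save the console output of each run and look at the last cluster reported before the crash. The Python sketch below is only an illustration, not part of miRA: `last_cluster_before_crash` is a hypothetical helper, and it assumes log lines of the form `Generating report for cluster 66...` followed by a `double free or corruption` message, as in the console output quoted later in this thread.

```python
import re

# Matches the cluster number in lines like "... Generating report for cluster 66..."
CLUSTER_RE = re.compile(r"Generating report for cluster (\d+)")


def last_cluster_before_crash(log_text):
    """Return the last cluster reported before the crash message, or None.

    None means either that the log has no crash message, or that the crash
    happened before any cluster report was logged.
    """
    last = None
    for line in log_text.splitlines():
        if "double free or corruption" in line:
            return last
        match = CLUSTER_RE.search(line)
        if match:
            last = int(match.group(1))
    return None
```

If several crashing runs on the same input return the same cluster number, that points to a particular candidate. Different numbers from run to run are more consistent with a memory problem.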

Maurits

pranalis018 commented 8 years ago

Hi Maurits, thanks for your patience and help. I have been running miRA on 8 different SAM files, and the exact console output I get is: output.txt

mevers commented 8 years ago

Hi.

Just to let you know, I've asked my colleague Michael to get involved in this, as I remember a potentially similar "double free or corruption" issue he dealt with in the past. If I remember correctly, there was an issue with (in)correctly parsing the SAM file during step 3. Unfortunately, SAM file format specifications are not very standardised across different aligners.

Maurits

mhuttner commented 8 years ago

Hi, I'm looking into the issue right now, but cannot pinpoint it in the code. Would you mind running miRA again with the output set to verbose, either with the -v flag in the command or by setting log_level = 2 in the config file, and posting the output again? As Maurits said, we dealt with a similar issue previously, which was caused by different compiler behavior between the LLVM compiler and gcc. If you need an immediate solution, switching to the LLVM compiler might fix the issue for you, but of course I will try to fix the misbehaving code as soon as possible.

Michael

pranalis018 commented 8 years ago

Hi Michael, I did as you asked and changed the value for log_level in the config file. But to my dismay the error still persists. I have attached the console output for you, please check and let me know the issue. output.txt

mhuttner commented 8 years ago

Is the data you are using publicly available? Would it be possible for you to share it with me so I can try to reproduce your issue on my machine? The logs show weird behavior, with errors showing up at different places.

pranalis018 commented 8 years ago

Hi, I have attached the whole directory with the SAM file, reference genome and a configuration file for you to try out. I am currently using a RHEL 7 operating system. Let me know if I need to make any system changes. I have 8 more such SAM files which need to be analysed urgently.

Thank you for your time and help:) sample.zip

mhuttner commented 8 years ago

Thanks for the example data, I can reproduce the bug on my machine as well. It seems to be a heap corruption problem in the code. The bad news is that it is very hard to pinpoint the issue, since the program only crashes when and if the corrupted memory is touched by other (valid) code, not when the corruption occurs. I'm still trying to find the source of the issue.

The good news is that the program will work in cases where the corrupted memory does not get touched again, so if your data is small enough, a temporary solution would be to rerun the program until it works. Of course this is not an acceptable solution, but unfortunately I cannot provide a better one in the short term. I managed to do this on my system after a few tries and I'm attaching the results as they may be useful to you. results.zip
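
The rerun workaround can be automated with a small wrapper. This is a minimal Python sketch under stated assumptions, not part of miRA: `run_until_success` is a hypothetical helper, and it treats exit status 0 as success. On POSIX systems a process killed by `abort()` (as in "Aborted (core dumped)") is reported by `subprocess` with a negative return code, so a crashed run counts as a failure.

```python
import subprocess


def run_until_success(command, max_attempts=5):
    """Run command until it exits with status 0.

    Returns the attempt number that succeeded, or None if all attempts failed.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return attempt
    return None
```

With the command from the first post, a call would look like `run_until_success(["./miRA", "full", "-c", "configuration.config", "bam_files/SpR_Ctrl.sam", "Ref_genome.fa", "spr-ctrl/"])`. A clean exit only shows that the corrupted memory was not touched in that run, so the output reports are still worth checking.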

mhuttner commented 8 years ago

I think I managed to link this issue to the logging code, which is not needed for miRA to function, so I temporarily disabled it. Could you try to download the latest version of miRA, miRA-1.1.4.tar.gz, and see if this fixes your issue?

Michael

pranalis018 commented 8 years ago

Hi Michael, I will check if the new code you provided works for my files. Just for confirmation, there is nothing wrong with the SAM file, is there? Thanks

pranalis018 commented 8 years ago

Hi, I downloaded the file and it configured well. But the make command is giving the following error:

src/vfold.c:245:46: error: ‘buffers’ undeclared (first use in this function)
buf) shared(buffers) schedule(dynamic)
^
src/vfold.c:245:46: note: each undeclared identifier is reported only once for each function it appears in
src/vfold.c: In function ‘write_foldable_sequence’:
src/vfold.c:531:11: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘u64’ [-Wformat=]
(c->strand ==) ? "minus" : "plus");
^
make[1]: *** [src/miRA-vfold.o] Error 1
make[1]: Leaving directory `/home/pranali/miRNA_analysis/miRA-1.1.4'
make: *** [all] Error 2

??

mhuttner commented 8 years ago

Yes, the SAM file is completely fine. miRA only ignores the first line in your file, @HD VN:1.3 SO:coordinate, but that is expected.

Sorry, I will fix the compile error right away.

mhuttner commented 8 years ago

It should work now. Sorry, I wanted to deploy fast and did not test compiling on Linux.

pranalis018 commented 8 years ago

Hi Michael, Unfortunately, the error still persists.

Wed May 18 16:43:20 2016 --- Folding completed successfully.
Wed May 18 16:43:20 2016 --- Coverage based verification...
Wed May 18 16:43:22 2016 --- Coverage based verification completed
Wed May 18 16:43:22 2016 --- Generating reports...
Wed May 18 16:43:22 2016 --- Generating report for cluster 66...
Wed May 18 16:43:35 2016 --- Generating report for cluster 85...
Wed May 18 16:43:35 2016 --- Generating reports completed
*** Error in `.': double free or corruption (!prev): 0x00000000047697c0 ***
Aborted (core dumped)

mhuttner commented 8 years ago

Based on the log, this seems to be a different error, and it only occurred after miRA was done. Did you get output reports?

pranalis018 commented 8 years ago

Hi, yes, I did get the reports but no coverage plots. And for some files no reports were generated at all.

mhuttner commented 8 years ago

I just pushed a new update to miRA 9b617e41e6073f6e5048c9e11f47ee7c74d936da. You might want to try your analysis with the miRA batch command instead. I also enabled additional logging for report generation, which might help you to see why you do not get your coverage plots.

Michael

pranalis018 commented 8 years ago

Hi Michael, I will look into it and let you know once I try it out:) Thanks, Pranali