smithlabcode / falco

A C++ drop-in replacement of FastQC to assess the quality of sequence read data
https://falco.readthedocs.io
GNU General Public License v3.0

malloc with nanopore FASTQ file #44

Closed Midnighter closed 3 months ago

Midnighter commented 1 year ago

Hi there,

I'm running falco on some nanopore sequencing data and on one out of 45 FASTQ files, I hit the following error:

[limits]        using file /usr/local/opt/falco/Configuration/limits.txt
[adapters]      using file /usr/local/opt/falco/Configuration/adapter_list.txt
[contaminants]  using file /usr/local/opt/falco/Configuration/contaminant_list.txt
[Thu Jan 19 13:04:51 2023] Started reading file x.fastq.gz
[Thu Jan 19 13:04:51 2023] reading file as gzipped FASTQ format
[running falco|                                                   |  0%]malloc(): unsorted double linked list corrupted
31/cf81b3880a6686b11bd6f7b4f43575/.command.sh: line 2:    29 Aborted                 (core dumped) falco --threads 1 x.fastq.gz -D x_raw_falco_data.txt -S x_raw_falco_summary.txt -R x_raw_falco_report.html

I'm afraid I can't share the FASTQ file with you. If there is some way I can investigate further, please let me know.

I'm running falco from the Docker image quay.io/biocontainers/falco:1.2.1--h867801b_3.

The command used was:

falco --threads 1 x.fastq.gz -D x_raw_falco_data.txt -S x_raw_falco_summary.txt -R x_raw_falco_report.html

andrewdavidsmith commented 1 year ago

Can you try to reproduce the error with a smaller file, so we have a simpler case? For example, run head on the file; if you get something small enough that still reproduces the error, even just telling us something about the lengths of the sequences would help. @guilhermesena1 would know more, but this seems like an older issue that would have been fixed in 1.2.1.
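
For anyone who cannot share the data, the suggestion above (take the head of the file and report the read lengths) could be sketched roughly as follows. This is not part of falco; the function names `read_fastq`, `write_subset`, and `summarize` are made up for illustration, and the sketch assumes plain four-line FASTQ records (no wrapped sequence lines) in a gzipped file.

```python
import gzip
from itertools import islice
from statistics import median


def read_fastq(path):
    """Yield (header, sequence, plus, quality) tuples from a gzipped FASTQ file.

    Assumption: every record is exactly four lines long.
    """
    with gzip.open(path, "rt") as handle:
        while True:
            record = [line.rstrip("\n") for line in islice(handle, 4)]
            if len(record) < 4:  # end of file, or a truncated final record
                break
            yield tuple(record)


def write_subset(src, dst, n_records):
    """Copy the first n_records records of src to dst (gzipped).

    Returns the list of read lengths of the copied records.
    """
    lengths = []
    with gzip.open(dst, "wt") as out:
        for header, seq, plus, qual in islice(read_fastq(src), n_records):
            out.write(f"{header}\n{seq}\n{plus}\n{qual}\n")
            lengths.append(len(seq))
    return lengths


def summarize(lengths):
    """Return read count and min/median/max read length, safe to post in an issue."""
    if not lengths:
        return {"reads": 0}
    return {
        "reads": len(lengths),
        "min": min(lengths),
        "median": median(lengths),
        "max": max(lengths),
    }
```

A call such as `summarize(write_subset("x.fastq.gz", "x_small.fastq.gz", 1000))` would write the first 1000 records to a new file (the output name here is hypothetical) and return only length statistics. Rerunning falco on the smaller file then shows whether the subset still triggers the malloc error.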

nick-youngblut commented 3 months ago

I'm getting the same error:

[Fri Mar 15 02:17:29 2024] Started reading file B16_Bl6_WT_lung_2.fastq.gz
[Fri Mar 15 02:17:29 2024] reading file as gzipped FASTQ format
[running falco|                                                   |  0%]malloc(): unsorted double linked list corrupted

The fastq file: B16_Bl6_WT_lung_2.fastq.gz

I'm using quay.io/biocontainers/falco:1.2.1--hd36ca80_4.

My command:

falco B16_Bl6_WT_lung_2.fastq.gz   \
  -D B16_Bl6_WT_lung_2/fastqc_data.txt  \
  -R B16_Bl6_WT_lung_2/fastqc_report.html \
  -S B16_Bl6_WT_lung_2/summary.txt

I should note that falco does work on that same sample (Nanopore data) if I am less strict with the quality filtering; in other words, falco completes successfully if the fastq includes more reads.

andrewdavidsmith commented 3 months ago

Thanks @nick-youngblut, I can reproduce this. It works with the latest source, so I'll see about getting that container updated.

andrewdavidsmith commented 3 months ago

@nick-youngblut I made a new release that works with your example (on my machines) and I updated the package in conda. I think the container on quay.io has been auto-generated from bioconda, so I'm not sure how long it will take for that to propagate. If you can try the version from conda and confirm that it works for you, I'll close the issue. Otherwise, I'll try something else.

andrewdavidsmith commented 3 months ago

The container for falco v1.2.2 became available on quay.io very quickly, and I was able to run a test with the data provided above and it seemed to work -- no malloc issue. I'm closing this issue. Thanks @nick-youngblut and @Midnighter for raising this issue. If there are further problems please open another issue, and hopefully I can get to these more quickly.