Open jessicarowell opened 2 years ago
Is centrifuge no longer actively maintained?
Sorry, I must have missed the notification for this issue. Centrifuge uses the FM index implementation from Bowtie2, so some of the error messages are inherited from it. Memory usage mostly depends on the size of the index files, and I think the HPVC database should be around 20GB. Can you check the size of the HPVC files on your system and maybe try Centrifuge with more memory? Thanks.
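To check the index size the reply above asks about, a small shell sketch along these lines may help. It assumes the index files sit in one directory and follow the `<prefix>.N.cf` naming that Centrifuge indexes normally use (for example `hpvc.1.cf`). The function name `index_size_bytes` is made up for illustration and is not part of Centrifuge.

```sh
# Sum the sizes (in bytes) of all index files that share a prefix.
# Assumption: index files are named <prefix>.1.cf, <prefix>.2.cf, ...
index_size_bytes() {
    isb_total=0
    for isb_file in "$1".*.cf; do
        # Skip the unexpanded glob pattern when no file matches.
        [ -e "$isb_file" ] || continue
        isb_size=$(wc -c < "$isb_file")
        isb_total=$((isb_total + isb_size))
    done
    echo "$isb_total"
}

# Example (index prefix taken from the command in this thread):
#   index_size_bytes "$HOME/ref/centrifuge/hpvc"
#   free -g    # Linux: total and available RAM in GiB, for comparison
```

The byte total can then be compared against the RAM of the instance, since memory use is expected to track the size of the index.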
I see. Thank you! My expanded hpvc database is 74GB. I was able to successfully run Centrifuge on a set of paired-end reads (each gzipped fastq 2GB) from a metagenomics sample on an EC2 instance with 128GB RAM. I haven't tried anything less than that yet.
I appreciate your time! In August I had created another issue about the metrics file; I'm still very interested in being able to output that file so if you have any time to address that one it would be really great! Thank you again. The software is cool and I also appreciate your paper explaining it; it's very helpful.
~ Jessica
Then you need more than 74GB of memory to run Centrifuge, so a 128GB allocation is a good choice.
I tried to fix the metrics file issue a few years back but could not find the bug. I'll check it again.
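The memory guidance in this reply (more RAM than the index size) can be turned into a rough pre-flight check. The sketch below assumes a Linux host, such as an EC2 instance, where `/proc/meminfo` reports `MemTotal` in kB. The helper names `mem_total_kb` and `has_enough_memory` are hypothetical, and the comparison is only a lower bound, not a measured requirement.

```sh
# Print MemTotal in kB from a meminfo-style file.
# Default: /proc/meminfo (Linux only).
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed only if total RAM exceeds the given index size in kB.
# Usage: has_enough_memory INDEX_KB [MEMINFO_FILE]
has_enough_memory() {
    [ "$(mem_total_kb "${2:-/proc/meminfo}")" -gt "$1" ]
}

# Example (index prefix taken from the command in this thread;
# du reports disk usage, which is close to the file sizes here):
#   index_kb=$(du -ck "$HOME"/ref/centrifuge/hpvc.*.cf | tail -n 1 | cut -f1)
#   has_enough_memory "$index_kb" || echo "index is larger than total RAM" >&2
```

On a 32 GB instance this check would fail for a 74GB index, which fits the out-of-memory behaviour discussed in the thread.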
I ran the command below on an r5a.xlarge AWS EC2 instance (4 vCPUs, 32 GB RAM) on two small fastq files (about 5MB each). I've pasted the full error below the command. I'm using the hpvc reference index from your website. I tried this with -p 4 and -p 2 and got the same error. I also see in the error that the -p option is not reproduced, and I am not 100% sure why.
I can't find what Bowtie2 is being used for in the Centrifuge tool. Can you explain it? And do you have any estimates of how much memory Bowtie2 needs (I assume it scales with input size)?
centrifuge -q -t --met-file classify/metrics.txt -x $HOME/ref/centrifuge/hpvc -1 R1_001.fastq.gz -2 R2_001.fastq.gz --report-file classify/c_report.tsv -S classify/c_result.out
Thank you!