kescull / immunopeptidogenomics

Tools for harnessing RNA-seq data to discover cryptic peptides in the immunopeptidome by mass spectrometry
MIT License

squish SIGBUS error #2

Open funnell opened 1 month ago

funnell commented 1 month ago

Hello,

I'm getting this error:

fish: Job 1, 'tools/immunopeptidogenomics/squ…' terminated by signal SIGBUS (Misaligned address error)

When running squish like so:

tools/immunopeptidogenomics/squish  \
    -d results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_transcriptome_3translate.fasta  \
    -d results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_indel_transcriptome_3translate.fasta \
    -d results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_unmasked_transcriptome_3translate.fasta \
    -o C6_unt_1_S1_full_cryptic.fasta \
    -t 1

This is the full output:

Number of threads specified: 1
input file 1: results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_transcriptome_3translate.fasta
input file 2: results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_indel_transcriptome_3translate.fasta
input file 3: results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_unmasked_transcriptome_3translate.fasta
2106787 entries in results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_transcriptome_3translate.fasta
1663158 entries left in results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_transcriptome_3translate.fasta after remove_duplicates()
1516643 entries in results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_indel_transcriptome_3translate.fasta
1229868 entries left in results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_indel_transcriptome_3translate.fasta after remove_duplicates()
After merge, 1993295 entries stored from 2 files
1526365 entries in results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_unmasked_transcriptome_3translate.fasta
1246990 entries left in results/ipg_db/dbs/C6_unt_1_S1/C6_unt_1_S1_unmasked_transcriptome_3translate.fasta after remove_duplicates()
After merge, 2113170 entries stored from 3 files
hash_len = 82128519
creating hash table...
hashing sequence 1 of 2113170
hashing sequence 100001 of 2113170
hashing sequence 200001 of 2113170
hashing sequence 300001 of 2113170
hashing sequence 400001 of 2113170
hashing sequence 500001 of 2113170
hashing sequence 600001 of 2113170
hashing sequence 700001 of 2113170
hashing sequence 800001 of 2113170
hashing sequence 900001 of 2113170
hashing sequence 1000001 of 2113170
hashing sequence 1100001 of 2113170
hashing sequence 1200001 of 2113170
hashing sequence 1300001 of 2113170
hashing sequence 1400001 of 2113170
hashing sequence 1500001 of 2113170
hashing sequence 1600001 of 2113170
hashing sequence 1700001 of 2113170
hashing sequence 1800001 of 2113170
hashing sequence 1900001 of 2113170
hashing sequence 2000001 of 2113170
hashing sequence 2100001 of 2113170
Thread 0 up and running
Thread 0 searching for seq 1
fish: Job 1, 'tools/immunopeptidogenomics/squ…' terminated by signal SIGBUS (Misaligned address error)

squish was compiled like this on my M3 MacBook Air:

gcc-14 squish.c -lm -lpthread -o squish

where:

gcc-14 --version
gcc-14 (Homebrew GCC 14.1.0_1) 14.1.0
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

funnell commented 1 month ago

The only modification I made to the program was that I needed to remove this line:

#include <malloc.h>

I had to do this to get any of the programs to compile, but this didn't seem to affect any other program as they all seemed to run fine after modification.
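For context on why removing that line is harmless: `<malloc.h>` is a non-standard header that macOS does not provide at that path, while standard C declares `malloc`, `calloc`, `realloc` and `free` in `<stdlib.h>`. A minimal sketch of allocation code that relies only on the standard header is below. The helper `copy_sequence` is hypothetical and for illustration only; it is not taken from squish.c.

```c
#include <assert.h>
#include <stdlib.h>   /* standard home of malloc/calloc/realloc/free */
#include <string.h>

/* Hypothetical helper (not from squish.c): duplicate a sequence string
 * using only the allocation functions declared in <stdlib.h>.
 * Returns NULL if the allocation fails; the caller frees the result. */
char *copy_sequence(const char *seq)
{
    assert(seq != NULL);

    size_t len = strlen(seq);
    char *copy = malloc(len + 1);   /* +1 for the terminating '\0' */
    if (copy == NULL)
        return NULL;

    memcpy(copy, seq, len + 1);     /* copies the terminator as well */
    return copy;
}
```

As long as `<stdlib.h>` is included, nothing in code like this depends on `<malloc.h>`, which is consistent with the other programs running fine after the change.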

kescull commented 1 month ago

Hi Tyler,

Sorry to hear it's not working for you. I'm glad any of it works on Mac though, since I hadn't tested that! Also, thanks for the tip about malloc.h; it looks like that's actually deprecated, so I will be deleting it and re-checking all my programs as soon as I have time.

Regarding the error, it could be a Mac issue. If you can try it on a Linux machine, that would be useful; it may be something as simple as incompatible text formats for newlines. I really don't know much about Mac. Another possibility is that it's struggling for memory; squish can use a lot (I think it would give a different error in that case, but it's worth checking). Again, I don't know how to monitor that on Mac, but I guess you do. Or, very likely, it's a bug in my code mis-allocating memory. One quick test would be to try it with >1 thread: I always run it with multiple threads, so it's possible I forgot to test with 1 and made some dumb mistake!

Otherwise, the problem is that it's been working for me, so I'd need to try it with your data to find the problem. Would you be happy to share your files with me? We could make contact off GitHub to transfer files privately if the data is confidential.

Regards,
Kate
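The newline theory mentioned above can be checked directly by counting carriage-return bytes in the input FASTA files, since Windows-style files end lines with `\r\n` rather than `\n`. The sketch below is a standalone check, assuming nothing about how squish itself reads its input; the function names `count_cr` and `count_cr_in_file` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Count carriage-return bytes ('\r') in a buffer of len bytes. */
size_t count_cr(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r')
            n++;
    }
    return n;
}

/* Count '\r' bytes in a whole file, reading it in binary mode so the
 * C library does not translate line endings.
 * Returns -1 if the file cannot be opened. */
long count_cr_in_file(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    char buf[65536];
    long total = 0;
    size_t got;
    while ((got = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long)count_cr(buf, got);

    fclose(fp);
    return total;
}
```

A result of 0 for each of the three `-d` inputs would suggest that line endings are not the cause; a non-zero count would mean the files contain `\r` bytes and may be worth converting before retrying.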

funnell commented 1 month ago

Hi Kate,

Thanks so much for the quick response! Unfortunately, running with multiple threads didn't work either :'( and I don't see any sign of running out of memory with the input I'm using. I will give it a try on Linux and report back.

In the meantime, I'm happy to share the input files off GitHub. What email address should I send them to?

kescull commented 1 month ago

No worries Tyler,

I thought I'd hunted down your address at MSKCC, but it bounced. :( My Monash email is the one I want to give you. I don't want to write it in plain text on a web forum, but you should be able to find it by googling me. (It's probably online already, but we have to try, right?? Make the web scrapers work for their money...) I've also sent you a LinkedIn invite, so let me know if none of that works to help us connect.

Thanks,
Kate