loneknightpy / idba

Data that used to work, now killed. #62

Open jwasmuth opened 3 years ago

jwasmuth commented 3 years ago

Hi, a few years ago I used idba successfully. Now I'm on a new server, and every meaningful dataset I have (paired-end, read length 100 bp) ends with the 'killed' message. Troubleshooting this is difficult. Would someone mind sharing a dataset that they know works with idba v1.1.3? Many thanks, James

jwasmuth commented 3 years ago

I found the data I used a couple of years ago. Sadly, it doesn't work with 1.1.3 on the new server, and I have no idea why. The dataset is 1 million paired-end reads of 100 bp. The job is killed while the reads are being loaded. It was run on a machine with 3 TB of RAM. I tracked memory usage up to the point where the job is killed, and it doesn't get close to the maximum. I welcome any ideas anyone may have.

th-of commented 3 years ago

Are you recompiling the source code on the system you are currently on, or running an old binary? Remember that idba only supports an interleaved FASTA file. Another explanation is a corrupted dataset: use a text editor like vim (or head and tail in bash) to check that the sequences are paired and have headers.

No problems on my side.
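The manual check suggested above can also be scripted. The following is a minimal sketch, not part of idba: it assumes that mates sit next to each other in the file and that their names differ only by a trailing `/1` or `/2`, which may not match how your reads are named. The function names `read_fasta`, `pair_stem`, and `check_interleaved` are made up for this example.

```python
import sys


def read_fasta(handle):
    """Yield (header, sequence) tuples from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def pair_stem(header):
    """Return the read name without a trailing /1 or /2 (assumed naming scheme)."""
    fields = header.split()
    name = fields[0] if fields else ""
    if name.endswith(("/1", "/2")):
        return name[:-2]
    return name


def check_interleaved(handle):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    records = read_fasta(handle)
    pair_number = 0
    for first in records:
        pair_number += 1
        second = next(records, None)  # the mate is expected on the next record
        if second is None:
            problems.append(f"pair {pair_number}: '{first[0]}' has no mate")
            break
        if pair_stem(first[0]) != pair_stem(second[0]):
            problems.append(
                f"pair {pair_number}: names differ ('{first[0]}' vs '{second[0]}')"
            )
        for header, sequence in (first, second):
            if not sequence:
                problems.append(f"pair {pair_number}: '{header}' has no sequence")
    return problems


if __name__ == "__main__" and len(sys.argv) == 2:
    # The path is whatever interleaved FASTA file you pass on the command line.
    with open(sys.argv[1]) as fasta:
        for problem in check_interleaved(fasta):
            print(problem)
```

The file is read as a stream, so memory use stays small even for millions of reads. A clean result only rules out the structural problems listed here; it says nothing about why a job is killed.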

jwasmuth commented 3 years ago

I have tried recompiling and, separately, installing from conda. Both give the same result. Extensive runs with different numbers of input sequences show that the larger the input file, the sooner the job gets killed, though I don't know why.

I did get it working on the old server last night, so I know that the sequence file isn't corrupted. It may be something in the configuration of the new server, for which I don't have any privileges.
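If the server configuration is the suspect, one thing that can be inspected without special privileges is the set of resource limits applied to your own processes. The sketch below is only a diagnostic aid and rests on assumptions not confirmed in this thread: that the server is Linux (or another Unix where Python's `resource` module is available), and that a per-process or cgroup limit could be involved at all. The cgroup paths are common defaults and may not be the cgroup your job actually runs in.

```python
import resource
from pathlib import Path

# Per-process limits worth looking at when a job dies with "Killed".
LIMITS = {
    "RLIMIT_AS": resource.RLIMIT_AS,          # total address space
    "RLIMIT_DATA": resource.RLIMIT_DATA,      # heap size
    "RLIMIT_STACK": resource.RLIMIT_STACK,    # stack size
    "RLIMIT_NOFILE": resource.RLIMIT_NOFILE,  # open file descriptors
}

# Assumed default locations; the real cgroup of a job may live elsewhere.
CGROUP_FILES = [
    Path("/sys/fs/cgroup/memory.max"),                    # cgroup v2
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),  # cgroup v1
]


def format_limit(value):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def process_limits():
    """Return {limit name: (soft, hard)} for the current process."""
    report = {}
    for name, key in LIMITS.items():
        soft, hard = resource.getrlimit(key)
        report[name] = (format_limit(soft), format_limit(hard))
    return report


def cgroup_memory_limit():
    """Return the first readable cgroup memory limit as text, or None."""
    for path in CGROUP_FILES:
        try:
            return path.read_text().strip()
        except OSError:
            continue
    return None


if __name__ == "__main__":
    for name, (soft, hard) in process_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
    print("cgroup memory limit:", cgroup_memory_limit() or "not found")
```

Running this on both the old and the new server, in the same way the idba job is launched (for example, inside the same batch job), would show whether the two environments differ. Identical output would not rule out other causes.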