cmmr / virmap

Segmentation fault errors #16

Open jingzhejiang opened 4 years ago

jingzhejiang commented 4 years ago

Hi,

I run virmap on a c5d.24xlarge instance (96 vCPUs, 192 GiB RAM, 4× 800 GB NVMe) with an 838 GB swap partition:

# swap setting
$ sudo mkswap -f /dev/nvme3n1
$ sudo swapon /dev/nvme3n1
$ free -h
              total        used        free      shared  buff/cache   available
Mem:           188G         77G         33G         74M         76G        109G
Swap:          838G        2.2G        835G

It works for many libraries, but fails for other libraries of similar size (perhaps those with too many viral contigs? I'm not sure). The failing libraries produce many empty files from the 12S-WGA.filtered.fa step onward.

$ ll -th 12S-WGA/
-rw-r--r-- 1 ec2-user ec2-user 2.0K Nov  5 13:28 12S-WGA.err
-rw-r--r-- 1 ec2-user ec2-user 1.3K Nov  5 13:28 12S-WGA.log
-rw-r--r-- 1 ec2-user ec2-user  607 Nov  5 13:28 12S-WGA.taxonomy.err.bz2
-rw-r--r-- 1 ec2-user ec2-user    0 Nov  5 13:27 12S-WGA.final.fa
-rw-r--r-- 1 ec2-user ec2-user    0 Nov  5 13:27 12S-WGA.diamondBlastx.out.bz2
-rw-r--r-- 1 ec2-user ec2-user 1.3K Nov  5 13:27 12S-WGA.diamondBlastx.err
-rw-r--r-- 1 ec2-user ec2-user   14 Nov  5 13:27 12S-WGA.blastn.out.bz2
-rw-r--r-- 1 ec2-user ec2-user   34 Nov  5 13:27 12S-WGA.blastn.err
-rw-r--r-- 1 ec2-user ec2-user  915 Nov  5 13:27 12S-WGA.selfAlign.err.bz2
-rw-r--r-- 1 ec2-user ec2-user    0 Nov  5 13:27 12S-WGA.selfAlign.fa
-rw-r--r-- 1 ec2-user ec2-user 1.4K Nov  5 13:27 12S-WGA.iterateImprove.err.bz2
-rw-r--r-- 1 ec2-user ec2-user    0 Nov  5 13:27 12S-WGA.improved.fa
-rw-r--r-- 1 ec2-user ec2-user 6.6K Nov  5 13:27 12S-WGA.filter.err.bz2
-rw-r--r-- 1 ec2-user ec2-user    0 Nov  5 13:21 12S-WGA.filtered.fa
-rw-r--r-- 1 ec2-user ec2-user 283M Nov  5 13:21 12S-WGA.blastnFilter.out.bz2
-rw-r--r-- 1 ec2-user ec2-user  58M Nov  5 11:56 12S-WGA.diamondFilter.out.bz2
-rw-r--r-- 1 ec2-user ec2-user  70M Nov  5 11:44 12S-WGA.combined.fa
-rw-r--r-- 1 ec2-user ec2-user 1.9M Nov  5 11:44 12S-WGA.combine.err
-rw-r--r-- 1 ec2-user ec2-user 1.6K Nov  5 11:34 12S-WGA.assembly.err.bz2
-rw-r--r-- 1 ec2-user ec2-user  62M Nov  5 11:34 12S-WGA.contigs.fa
-rw-r--r-- 1 ec2-user ec2-user 6.1M Nov  5 09:04 12S-WGA.superScaffolds.err.bz2
-rw-r--r-- 1 ec2-user ec2-user 9.7M Nov  5 09:04 12S-WGA.superScaffolds.fa
-rw-r--r-- 1 ec2-user ec2-user  11M Nov  5 08:58 12S-WGA.superScaffolds.init.fa
-rw-r--r-- 1 ec2-user ec2-user 2.2M Nov  5 08:58 12S-WGA.pseudoScaffolds.fa
-rw-r--r-- 1 ec2-user ec2-user 200K Nov  5 08:52 12S-WGA.both.centroids.txt
-rw-r--r-- 1 ec2-user ec2-user 221M Nov  5 08:49 12S-WGA.aa.sam.bz2
-rw-r--r-- 1 ec2-user ec2-user 3.0K Nov  5 08:26 12S-WGA.bbmap.err
-rw-r--r-- 1 ec2-user ec2-user 8.9M Nov  5 08:26 12S-WGA.nuc.sam.bz2
-rw-r--r-- 1 ec2-user ec2-user 341M Nov  5 08:12 12S-WGA.normalized.fa.bz2
-rw-r--r-- 1 ec2-user ec2-user 2.3K Nov  5 08:12 12S-WGA.normalize.err
-rw-r--r-- 1 ec2-user ec2-user 1.7G Nov  5 08:07 12S-WGA.derep.fa.bz2
-rw-r--r-- 1 ec2-user ec2-user  397 Nov  5 08:07 12S-WGA.derep.err
-rw-r--r-- 1 ec2-user ec2-user 2.5G Nov  5 07:39 12S-WGA.fa.bz2

Here is the content of the error log file:

TIME 12S-WGA decompress: 166.73 seconds
TIME 12S-WGA dereplicate: 1692.31 seconds
TIME 12S-WGA normalize: 281.11 seconds
TIME 12S-WGA bbmap to virus: 822.88 seconds
TIME 12S-WGA diamond to virus: 1364.85 seconds
TIME 12S-WGA construct superscaffolds: 913.65 seconds
TIME 12S-WGA megahit assembly: 9022.49 seconds
TIME 12S-WGA dedupe assembly: 17.12 seconds
TIME 12S-WGA merge assembly: 580.39 seconds
TIME 12S-WGA diamond filter map: 693.05 seconds
TIME 12S-WGA blastn filter map: 5120.28 seconds
sh: line 1: 72606 Segmentation fault      determineTaxonomy.pl filter /scratch/VirmapDb/Taxonomy.virmap /scratch/tmp/MEUE7kr6ri/12S-WGA.entropy.fa /scratch/Virmap_result/12S-WGA/12S-WGA.diamondFilter.out.bz2 /scratch/Virmap_result/12S-WGA/12S-WGA.blastnFilter.out.bz2 96 > /scratch/Virmap_result/12S-WGA/12S-WGA.filtered.fa 2>> /scratch/tmp/MEUE7kr6ri/12S-WGA.filter.err
TIME 12S-WGA filter contigs: 377.65 seconds
FAlite: Empty
TIME 12S-WGA iterative improvement: 6.71 seconds
TIME 12S-WGA self align and quantify: 1.19 seconds
FAlite: Empty
TIME 12S-WGA blastn full: 0.02 seconds
FAlite: Empty
cat: /scratch/tmp/MEUE7kr6ri/12S-WGA.diamondBlastx.out: No such file or directory
FAlite: Empty
FAlite: Empty
FAlite: Empty
lbzip2: skipping "/scratch/tmp/MEUE7kr6ri/12S-WGA.diamondBlastx.out": lstat(): No such file or directory
lbzip2: skipping "/scratch/tmp/MEUE7kr6ri/12S-WGA.diamondBlastx.remain.out": lstat(): No such file or directory
TIME 12S-WGA diamond full: 0.04 seconds
TIME 12S-WGA determine taxonomy: 26.89 seconds
TIME 12S-WGA Overall Virmap time: 21087.35 seconds

I also checked the end of 12S-WGA.filter.err.bz2 and found this error message on the last line:

...................
THREAD 77 - GOT contig_2641;src=megahit;length=2873;cov=53.3;gc=0.646;taxId=10239
THREAD 90 - GOT gi|NoNucHits|gb|MF418016.1|Bacillus.phage.vB_BceM.HSE3;taxId=2170705;length=168394
THREAD 16 - GOT contig_9275;src=megahit;length=2881;cov=51;merged=1;taxId=10239
THREAD 58 - GOT contig_19539;src=megahit;length=1666;cov=54.1;gc=0.627;taxId=10239
THREAD 67 - GOT gi|NEWDB|gb|KY935474.1|Influenza.A.virus.A.Porto.Alegre.LACENRS.1155.2015.H3N2.segment.7.matrix.protein.2.M2.and.matrix.protein.1.M1.genes.complete.cds.;taxId=1979338;length=1047
Thread 94 terminated abnormally: IO error: While opening a file for sequentially reading: /dev/shm/vBuhJeZAAe/TAX/CURRENT: Too many open files at /usr/local/bin/determineTaxonomy.pl line 458.

Is this a memory error, even with 838 GB of swap configured? Can I work around it by adjusting the swap settings, or do I have to switch to an expensive, high-memory instance?

Another important question: can virmap resume execution from the interrupted step, so as to save some time and money?

Thank you for your reply!

Attachments: 12S-WGA.err.gz, 12S-WGA.filter.err.gz

torptube commented 4 years ago

I would recommend not using swap at all, and instead turning the NVMe drives into a RAID-0 array used as scratch space for the database and the files being processed. If your databases aren't on a high-speed I/O device, the blastn steps can be significantly slowed down.

You can use what the installer does to create a RAID-0 array on any system with more than one local SSD:

sudo mdadm --create /dev/md0 --level=0 --raid-devices=$(sudo nvme list | grep "Amazon EC2 NVMe Instance Storage" | wc -l) $(sudo nvme list | grep "Amazon EC2 NVMe Instance Storage" | cut -f1 -d " " | tr "\n" " ")
sudo mkfs.ext4 /dev/md0
sudo mount /dev/md0 /scratch

Those commands should make a RAID-0 array with whatever local SSDs you have and mount it to /scratch.
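
For reference, a quick way to sanity-check the array afterward (a minimal sketch using standard tools; /dev/md0 and /scratch are the device and mount point from the commands above):

# confirm the array is assembled and all member disks are active
$ cat /proc/mdstat
$ sudo mdadm --detail /dev/md0
# confirm the filesystem is mounted with the expected combined capacity
$ df -h /scratch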

Regarding your error: it's not a memory error. You have to raise the open file limit, and I would raise the process/thread limit too, just in case.

You can change the limits using ulimit:

ulimit -a          # show your current limits
ulimit -n 65536    # set max open files to 65536
ulimit -u 4096     # set max processes/threads to 4096

Hard limits vary from system to system; I don't remember the hard limits on Amazon Linux 2, but I think they are higher than the values suggested above.
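
To see what the hard limits actually are, and what a running job inherited (a minimal sketch; the pgrep pattern is just an example matching the script that hit the open-files error above):

# hard limits for the current shell
$ ulimit -Hn
$ ulimit -Hu
# limits actually inherited by a running process
$ cat /proc/$(pgrep -f determineTaxonomy.pl | head -n1)/limits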

There is no ability to resume from a previous run, sorry. However, depending on what you are looking for, you could potentially skip the time-intensive megahit and iterative improvement steps. Try running with and without either or both of --noAssembly and --noIterImp and see whether the results are similar enough that you can simply skip one or both of those steps.
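
For example, a comparison along these lines (a minimal sketch assuming the Virmap.pl entry point: only --noAssembly and --noIterImp come from this thread, <usual args> stands for whatever read, database, and output options your runs normally use, and the output locations should differ between runs so results aren't overwritten):

# baseline: full pipeline
$ Virmap.pl <usual args>
# skip only the iterative improvement step
$ Virmap.pl <usual args> --noIterImp
# skip both the megahit assembly and the iterative improvement steps
$ Virmap.pl <usual args> --noAssembly --noIterImp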

jingzhejiang commented 4 years ago

Thank you for the instructions! I followed your no-swap, RAID-0, and ulimit suggestions. For the ulimit settings, I configured them as below:

# add the following to /etc/profile
ulimit -n 1024000
ulimit -u unlimited
ulimit -s unlimited
ulimit -i 255983
ulimit -SH unlimited
ulimit -f unlimited

# add the following to /etc/security/limits.conf
* hard nofile 1024000
* soft nofile 1024000
* hard nproc unlimited
* soft nproc unlimited
* soft core 0
* hard core 0
* soft sigpending 255983
* hard sigpending 255983

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 255983
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

After that I used parallel -j 3 --xapply to run virmap on the c5d.24xlarge instance (96 vCPUs, 192 GiB RAM, 4× 800 GB NVMe), but I still encounter errors at the 1S-WGA.combined.fa step.
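
The launch looked roughly like this (a minimal sketch; run_virmap.sh is a hypothetical per-sample wrapper around Virmap.pl, and the sample names and read paths are only examples):

# -j 3 keeps three virmap jobs running at once;
# --xapply pairs the two input lists positionally (sample name with its read prefix)
$ parallel -j 3 --xapply ./run_virmap.sh {1} {2} \
    ::: 1S-WGA 2S-WGA 12S-WGA \
    ::: reads/1S-WGA reads/2S-WGA reads/12S-WGA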

The error log file:

TIME 1S-WGA decompress: 140.38 seconds
TIME 1S-WGA dereplicate: 238.73 seconds
TIME 1S-WGA normalize: 90.41 seconds
TIME 1S-WGA bbmap to virus: 1484.10 seconds
TIME 1S-WGA diamond to virus: 1204.22 seconds
TIME 1S-WGA construct superscaffolds: 312.40 seconds
TIME 1S-WGA megahit assembly: 5347.56 seconds
TIME 1S-WGA dedupe assembly: 43.31 seconds
TIME 1S-WGA merge assembly: 1427.51 seconds
FAlite: Empty
lbzip2: skipping "/scratch/tmp/sEC7KktVKx/1S-WGA.diamondFilter.out": lstat(): No such file or directory
TIME 1S-WGA diamond filter map: 0.51 seconds
FAlite: Empty
TIME 1S-WGA blastn filter map: 0.09 seconds
TIME 1S-WGA filter contigs: 27.09 seconds
FAlite: Empty
TIME 1S-WGA iterative improvement: 6.56 seconds
TIME 1S-WGA self align and quantify: 4.91 seconds
FAlite: Empty
TIME 1S-WGA blastn full: 0.24 seconds
FAlite: Empty
cat: /scratch/tmp/sEC7KktVKx/1S-WGA.diamondBlastx.out: No such file or directory
FAlite: Empty
FAlite: Empty
FAlite: Empty
lbzip2: skipping "/scratch/tmp/sEC7KktVKx/1S-WGA.diamondBlastx.out": lstat(): No such file or directory
lbzip2: skipping "/scratch/tmp/sEC7KktVKx/1S-WGA.diamondBlastx.remain.out": lstat(): No such file or directory
TIME 1S-WGA diamond full: 0.28 seconds
TIME 1S-WGA determine taxonomy: 27.37 seconds
TIME 1S-WGA Overall Virmap time: 10355.72 seconds

1S-WGA.combine.err

Outer iteration 0.0 of merge
Iteration 0.0 of merge
Called merge with strict

Building a new DB, current time: 11/06/2020 09:53:03
New DB name:   /dev/shm/QZwLVQgAGs/easyMergeDb
New DB title:  /scratch/tmp/nzA_1WN3jC/1S-WGA.combined.fa.prepped.iterationMerge.0.0.fa
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 251296 sequences in 14.7508 seconds.

sh: line 1: 83775 Segmentation fault      easyMerge.pl /scratch/tmp/nzA_1WN3jC/1S-WGA.combined.fa.prepped.iterationMerge.0.0.fa 96 strict > /scratch/tmp/nzA_1WN3jC/1S-WGA.combined.fa.iterationMerge.0.0.fa
FAlite: Empty
END Inner iteration 0.0 of merge
inner before after merge: 125648
inner after after mege count: 0
Iteration 0.1 of merge
FAlite: Empty
FAlite: Empty
Called merge with strict
FAlite: Empty
FAlite: Empty
FAlite: Empty
END Inner iteration 0.1 of merge
inner before after merge: 0
inner after after mege count: 0
Finished inner loop after 0.1 iterations
in was /scratch/tmp/nzA_1WN3jC/1S-WGA.combined.fa.iterationMerge.0.1.fa

BACK to outer loop
FAlite: Empty
END Outer iteration 0.1 of merge
before count 125648
out after count 0
FAlite: Empty
Outer iteration 1.1 of merge
FAlite: Empty
Iteration 1.1 of merge
FAlite: Empty
FAlite: Empty
Called merge with strict
FAlite: Empty
FAlite: Empty
FAlite: Empty
END Inner iteration 1.1 of merge
inner before after merge: 0
inner after after mege count: 0
Finished inner loop after 1.1 iterations
in was /scratch/tmp/nzA_1WN3jC/1S-WGA.combined.fa.iterationMerge.1.1.fa

BACK to outer loop
FAlite: Empty
END Outer iteration 1.1 of merge
before count 0
out after count 0
Finished merging after 1 iterations
printing infile /scratch/tmp/nzA_1WN3jC/1S-WGA.combined.fa.iterationMerge.1.1.fa
FAlite: Empty

I see a segmentation fault again. Is it because I didn't set ulimit correctly?

Thank you again!

Ginger

torptube commented 4 years ago

Hmm, this one is tricky.

Would you be able to post a compressed version of the .combined.fa somewhere? I would need to test it, since there doesn't seem to be an easy explanation for this one. easyMerge.pl itself might be crashing, but it's written in Perl, so that seems unlikely. It could also be something it calls, but that would need inspecting against your actual files; some of your other samples go through without any issues, right?

In the meantime, I made some changes around where I suspect easyMerge.pl is failing; you'll need to pull the repo again to get the updated version of easyMerge.pl. Hopefully the fixes work.

jingzhejiang commented 4 years ago

Hi Matt,

Thank you for your help! Yes, some of my other samples (especially those with fewer and smaller viral contigs) go through without any issues. The 1S-WGA.combined.fa file is empty, so I suspect the error happens in the combine step. I have publicly uploaded all the files and logs generated during the assembly of the 1S-WGA library to my S3 bucket (the address has been sent to your email: torptube@gmail.com). I hope you can find some clues.

Ginger

torptube commented 3 years ago

It's failing to combine the pseudo-constructed assembly and the de-novo assembly. The segfault happens in easyMerge.pl, but I am not sure what is causing it.

I didn't get an email with a link to the S3 bucket.

Did you update easyMerge.pl on your VM image? I updated it with some speculative fixes and a little more robust error reporting to try to track down the error. Can you pull it and run your sample again?

jingzhejiang commented 3 years ago

Thank you for your reply! I just sent another email from my Gmail account (jingzhejiang@gmail.com). I haven't tested the new easyMerge.pl yet. I'll try it and post here if there is any progress.

jingzhejiang commented 3 years ago

Hi Matt,

I still encounter the same error at the 1S-WGA.combined.fa step, even after updating to your new easyMerge.pl. Below are the intermediate files generated before the error occurred. I hope they are enough for you to analyze the problem. Thank you!

1S-WGA.combined.fa (0.0 B)
https://ginger-ohio.s3-accelerate.amazonaws.com/JJZ/nanhai/VirMAP/1S-WGA/1S-WGA.combined.fa.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAI6UOUQUYB2IQ6HLA%2F20201223%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20201223T014249Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=6e4f07e163e8dd552bf244569d2f17e6d884f0c56bbecf93cad22dbe6726eed8
1S-WGA.combine.err (1.5 KB)
https://ginger-ohio.s3-accelerate.amazonaws.com/JJZ/nanhai/VirMAP/1S-WGA/1S-WGA.combine.err?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAI6UOUQUYB2IQ6HLA%2F20201223%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20201223T014337Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=9288187a2fe9b6c738662296b8e9cab4165b43d96c81aba483f5ddc7b0d63842
1S-WGA.assembly.err.bz2 (1.6 KB)
https://ginger-ohio.s3-accelerate.amazonaws.com/JJZ/nanhai/VirMAP/1S-WGA/1S-WGA.assembly.err.bz2?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAI6UOUQUYB2IQ6HLA%2F20201223%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20201223T014505Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=fbe2dc939b6b43cceb6a08620403bd82ef78f6d306202a9a5a455590b0e09d2b
1S-WGA.contigs.fa (34.7 MB)
https://ginger-ohio.s3-accelerate.amazonaws.com/JJZ/nanhai/VirMAP/1S-WGA/1S-WGA.contigs.fa.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAI6UOUQUYB2IQ6HLA%2F20201223%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20201223T014626Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=1f1c7038b7ac06083404fcb9980f5a13a3c51615ac9930c2370d8948ec7f3728
1S-WGA.superScaffolds.err.bz2 (6 MB)
https://ginger-ohio.s3-accelerate.amazonaws.com/JJZ/nanhai/VirMAP/1S-WGA/1S-WGA.superScaffolds.err.bz2?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAI6UOUQUYB2IQ6HLA%2F20201223%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20201223T015043Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=bc156c04bdc12ae12f9610e46cba9e0504bb661710345f741d6004ea6be8f4b4
1S-WGA.superScaffolds.fa (219.4 KB)
https://ginger-ohio.s3-accelerate.amazonaws.com/JJZ/nanhai/VirMAP/1S-WGA/1S-WGA.superScaffolds.fa.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAI6UOUQUYB2IQ6HLA%2F20201223%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20201223T014816Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=82cb69f6139091d99864ad2312c6bd292177fbae9892c6fccee053ae33979f71

I also sent a full list of the intermediate files to your email (torptube@gmail.com) from my Gmail account (jingzhejiang@gmail.com). Thank you again!

Ginger

jingzhejiang commented 3 years ago

Hi Matt,

I know you must be busy. I just want to make sure that you got my email; in it, I updated the S3 download paths for those files. I hope you received it. Thank you for your efforts, and I look forward to good results.

Ginger

torptube commented 3 years ago

Oh man, sorry, yeah, been very busy lately. I haven't had time to download and evaluate the intermediate files. Can you refresh the links?

jingzhejiang commented 3 years ago

Of course! I have refreshed the links above; they will be valid for a week. Thank you!

torptube commented 3 years ago

Hi Ginger,

I tried to download from the links above, but it didn't work. Can you refresh them again?

Cheers, Matt

jingzhejiang commented 3 years ago

YES, just updated. Thank you!

torptube commented 3 years ago

Hi Ginger,

I tried to download from the links, but it still says expired. Can you post new links?

Cheers, Matt

jingzhejiang commented 3 years ago

Hi Matt

I have sent them to your Gmail inbox (torptube@gmail.com). Let me know if it doesn't work. Thank you!

Ginger

jingzhejiang commented 3 years ago

Hi Matt

I hope you are doing well! Is there any progress? I sent the data to your email (torptube@gmail.com) on Dec 24, 2020. Wishing you good luck, and thank you!

Ginger

jingzhejiang commented 3 years ago

Hi Bro,

I hope you are doing well! Have you reproduced my bug on an EC2 instance? I am waiting for a solution for my data; if it cannot be solved, I will have to turn to other assembly tools. Thank you again!

Ginger

torptube commented 3 years ago

Hi Ginger,

Sorry for the delay. Gonna work on this on EC2 today. Hopefully I will have a solution for you soon.

Cheers, Matt

torptube commented 3 years ago

OK, so I have a putative fix up, but the issue was actually running out of RAM on a c5d.9xlarge instance.

easyMerge.pl was horribly unoptimized, so I changed a few things to make it thread better and be a little more RAM-efficient per thread, but given your dataset I can't get it under 8 GB/thread. So, given the RAM-per-thread ratio on the c5 instances, you will be underutilized thread-wise, since you can only use around 20% of the threads per machine on this step.

I am going to test on the r5d.24xlarge to see how scaling works given a much higher RAM/thread allocation.
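
As a rough sizing sketch (assuming the ~8 GB/thread figure above; RAM and vCPU counts are the published EC2 specs):

# usable easyMerge.pl threads ≈ total RAM / 8 GB per thread
# c5d.24xlarge: 192 GiB / 8 GiB ≈ 24 threads, only about a quarter of its 96 vCPUs
# r5d.24xlarge: 768 GiB / 8 GiB ≈ 96 threads, matching its 96 vCPUs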

But in the meantime, it should work if you want to run it under your current VM conditions.

torptube commented 3 years ago

Wait, don't use that version.

It "succeeds" at easyMerge.pl, but there are problems with the output that I didn't see last night and it fails mergeWrapper.pl.

I'll revert for now and work on a fix.

jingzhejiang commented 3 years ago

Thank you very much! Waiting for your good news!

jingzhejiang commented 3 years ago

Hi, Matt! Is there any progress? :P

jingzhejiang commented 3 years ago

Hello, Matt! I hope you are well! Do you think this is a bug that can be solved in a short time?