Psy-Fer / interARTIC

InterARTIC - An interactive local web application for viral whole genome sequencing utilising the ARTIC network pipelines.
https://psy-fer.github.io/interARTIC/
MIT License

Guppyplex: illegal instruction, core dump #69

Open omarkr8 opened 2 years ago

omarkr8 commented 2 years ago

I'm finding strange output at the first step of the pipeline; I'm assuming it is trying to use guppyplex to collect the demuxed data.

The output looks like:

RUNNING GUPPYPLEX COMMAND
Illegal instruction (core dump)

and that is repeated many, many times, presumably once per barcode.

I do not believe this is an issue with the pipeline per se, because I can run the same parameters and data on a different machine with no obvious issues at this step. So the question is: what is different about my VM Ubuntu setup on this problematic machine compared to the machine that worked? (I have no idea which parts could be important.)

The reason I'm using this problematic machine is that it's meant to have a more powerful CPU, so I just hope this core dump issue is a matter of reallocating resources.

hasindu2008 commented 2 years ago

This kind of error was observed on ARM-v8 processors due to the OpenBLAS core type not being recognised correctly, which was then fixed. Is your processor Intel or ARM? Can you send the details of your processor - the output of cat /proc/cpuinfo?

omarkr8 commented 2 years ago

So I've given Ubuntu 3 processors in the VM...

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Silver 4208 CPU @ 2.10GHz
stepping        : 7
microcode       : 0xffffffff
cpu MHz         : 2095.078
cache size      : 11264 KB
physical id     : 0
siblings        : 3
core id         : 0
cpu cores       : 3
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni ssse3 cx16 pcid sse4_1 sse4_2 hypervisor lahf_lm invpcid_single ibrs_enhanced fsgsbase invpcid md_clear flush_l1d arch_capabilities
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs itlb_multihit
bogomips        : 4190.15
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Silver 4208 CPU @ 2.10GHz
stepping        : 7
microcode       : 0xffffffff
cpu MHz         : 2095.078
cache size      : 11264 KB
physical id     : 0
siblings        : 3
core id         : 1
cpu cores       : 3
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni ssse3 cx16 pcid sse4_1 sse4_2 hypervisor lahf_lm invpcid_single ibrs_enhanced fsgsbase invpcid md_clear flush_l1d arch_capabilities
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs itlb_multihit
bogomips        : 4190.15
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Silver 4208 CPU @ 2.10GHz
stepping        : 7
microcode       : 0xffffffff
cpu MHz         : 2095.078
cache size      : 11264 KB
physical id     : 0
siblings        : 3
core id         : 2
cpu cores       : 3
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni ssse3 cx16 pcid sse4_1 sse4_2 hypervisor lahf_lm invpcid_single ibrs_enhanced fsgsbase invpcid md_clear flush_l1d arch_capabilities
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs itlb_multihit
bogomips        : 4190.15
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

hasindu2008 commented 2 years ago

Looking at the processor spec https://ark.intel.com/content/www/us/en/ark/products/193390/intel-xeon-silver-4208-processor-11m-cache-2-10-ghz.html, it supports the instruction extensions Intel® SSE4.2, Intel® AVX, Intel® AVX2 and Intel® AVX-512, but in the output you sent, only SSE4.2 is available. Modules like NumPy and TensorFlow rely on AVX extensions, which seem not to be available inside your VM. To verify this possibility, could you answer the following questions:
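As a quick check inside the VM, the AVX-family flags can be filtered out of /proc/cpuinfo directly. This is a generic sketch, not part of interARTIC: empty output means the hypervisor is masking AVX, and any AVX-compiled NumPy/TensorFlow code will crash with "Illegal instruction".

```shell
# List the unique AVX-family flags the (virtual) CPU exposes.
# On this Xeon you would expect to see at least avx and avx2;
# no output at all means the VM is not passing AVX through.
grep -o 'avx[a-z0-9_]*' /proc/cpuinfo | sort -u
```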

Is the cat /proc/cpuinfo output you sent above taken inside the virtual machine?

Which hypervisor/Virtual Machine Manager are you using?

What is the host operating system?

Can you also run cat /proc/cpuinfo on the host operating system?

omarkr8 commented 2 years ago

The cat output was taken inside the VM. I'm using Oracle VM VirtualBox Manager.

The host machine is Windows 10 Pro. I can't cat /proc/cpuinfo on the Windows command line; an alternative command, wmic cpu, gives me:

Intel64 Family 6 Model 85 Stepping 7, Intel(R) Xeon(R) Silver 4208 CPU @ 2.10GHz, 8 Cores

hasindu2008 commented 2 years ago

Right, it seems like Oracle VM VirtualBox Manager is not exposing AVX, and this might be the cause: https://stackoverflow.com/questions/65780506/how-to-enable-avx-avx2-in-virtualbox-6-1-16-with-ubuntu-20-04-64bit.

However, since you are running Windows 10, there is another easy way to run interARTIC without any VMs - natively on Windows using WSL. Have you got WSL installed? If not, it is pretty straightforward: https://www.windowscentral.com/install-windows-subsystem-linux-windows-10. This is the method I use to run interARTIC on my Windows laptop.
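On recent Windows 10 builds (2004 and later, fully updated), the WSL setup described in the linked article can be condensed to a couple of commands from an elevated PowerShell or Command Prompt; this is a hedged sketch - older builds need the manual feature-enable steps in the article:

```shell
# Install WSL plus a default Ubuntu distribution (requires a reboot afterwards)
wsl --install
# After rebooting, confirm which distributions are installed and their WSL version
wsl --list --verbose
```

These are Windows-side commands, so they cannot be run from inside the existing VM.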

omarkr8 commented 2 years ago

I personally prefer the WSL2-natively-on-Windows method too, but I run these pipelines on communal lab computers, so I'll have to think about how having both (or swapping) will affect everyone.

For now, let me try that Oracle AVX route. If it worked for our other computer, it should work for this one too.

hasindu2008 commented 2 years ago

Another option is Hyper-V, which comes built into Windows 10 - Microsoft's own VMM. Have you tried that before? It seems like it is this built-in Hyper-V that causes issues with VirtualBox.

hasindu2008 commented 2 years ago

This is another article about making VirtualBox work with AVX: https://forums.virtualbox.org/viewtopic.php?f=25&t=99390

omarkr8 commented 2 years ago

Hmm, I followed the steps in those posts and the icon changed (the turtle icon is now a V chip icon).

But I'm still getting the core dump error, so something else must be interfering. Let me have a look at what else this machine has... There's a Docker on Windows that tries to start up at launch; could that be relevant?

hasindu2008 commented 2 years ago

Can you cat /proc/cpuinfo again and see if AVX has appeared?

hasindu2008 commented 2 years ago

Could you try this: https://petri.com/how-to-disable-hyper-v-completely-in-windows-10?
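For reference, the gist of that guide is to stop the Hyper-V hypervisor from loading at boot so VirtualBox gets direct access to VT-x (and hence can expose AVX). A sketch of the usual commands, run from an elevated Command Prompt and followed by a reboot - treat these as illustrative, the linked article is authoritative:

```shell
# Stop the Hyper-V hypervisor from launching at boot
bcdedit /set hypervisorlaunchtype off
# Optionally remove the Hyper-V feature entirely
dism.exe /Online /Disable-Feature:Microsoft-Hyper-V-All
```

Note this will break anything that depends on Hyper-V (including Docker Desktop in its Hyper-V backend mode).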

omarkr8 commented 2 years ago

Looks like it has.

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Silver 4208 CPU @ 2.10GHz
stepping        : 7
cpu MHz         : 2095.080
cache size      : 11264 KB
physical id     : 0
siblings        : 3
core id         : 0
cpu cores       : 3
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single fsgsbase avx2 invpcid rdseed clflushopt md_clear flush_l1d arch_capabilities
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs taa itlb_multihit
bogomips        : 4190.16
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Silver 4208 CPU @ 2.10GHz
stepping        : 7
cpu MHz         : 2095.080
cache size      : 11264 KB
physical id     : 0
siblings        : 3
core id         : 1
cpu cores       : 3
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single fsgsbase avx2 invpcid rdseed clflushopt md_clear flush_l1d arch_capabilities
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs taa itlb_multihit
bogomips        : 4190.16
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Silver 4208 CPU @ 2.10GHz
stepping        : 7
cpu MHz         : 2095.080
cache size      : 11264 KB
physical id     : 0
siblings        : 3
core id         : 2
cpu cores       : 3
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single fsgsbase avx2 invpcid rdseed clflushopt md_clear flush_l1d arch_capabilities
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs taa itlb_multihit
bogomips        : 4190.16
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

hasindu2008 commented 2 years ago

It seems AVX has now appeared after that change, so it is strange that the error still occurs. Which Ubuntu version are you using? Let me try the same setup on my laptop.

hasindu2008 commented 2 years ago

Also, could you please run the following commands inside your VM: uname -a and lsb_release -a

omarkr8 commented 2 years ago

Linux lab6-VirtualBox 5.11.0-37-generic #41-Ubuntu SMP Mon Sep 20 16:39:20 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 21.04
Release:        21.04
Codename:       hirsute

hasindu2008 commented 2 years ago

Everything looks normal, and I tried on my laptop with Oracle VirtualBox - even with the turtle symbol it ran. [screenshot]

Could you send the about page (like below), just in case? [screenshot]

Could you zip your virtual machine so that I can reproduce this issue? If your machine contains information that you cannot share, I can share mine so you can test it on yours: https://www.dropbox.com/s/ezi8j78t9drtg5o/test_VM.zip?dl=0 - the password is just aa. interARTIC is already inside the home directory, and a test dataset is inside /data (a sample barcode CSV as well), on which you can directly run the test (select multiple samples, "already guppy multiplexed", ARTIC v3 and ligation library prep).

One more thing: if you could kindly test with WSL on your host machine, we could isolate the problem - that is, whether it is something to do with your processor/host or with the VM.

omarkr8 commented 2 years ago

I'll send the about page tomorrow. I'm curious about issues from having multiple systems on the same machine like that. Is it really okay to have a WSL Ubuntu AND a VM Ubuntu? Do they share the same directories?

I had a look at Docker Desktop; it turns out it is another VMM, and I'm not sure what it's running. Is it possible it was interfering with resources for VirtualBox?

I wonder if this isn't a problem that simply needs things reinstalled.

hasindu2008 commented 2 years ago

If you have Docker Desktop, it is very likely that WSL is already installed, so you will just need to install Ubuntu from the Microsoft Store. They do not share the same directories, so you can have both without any issue.

Docker might have enabled Hyper-V on Windows, which is Windows' built-in VMM and is likely to interfere with Oracle VirtualBox.