Created by Gitlab User agomezco:
When a supercollider command containing the thread type DWr_vmovdqu is executed, a trap is generated signaling an invalid opcode. The algorithm for this thread type uses the vmovdqu instruction, and Atom cores support only a subset of the variants of this instruction.
As an example, two terminal outputs are shown below: one from a PuTTY session over the network and one from a PuTTY session over a serial port. The first shows supercollider running and displaying the status of its threads. The thread running the DWr_vmovdqu algorithm (pid 4759) does not execute any cycles (its Loops count stays at 0), even after the test has completed (Time equals one minute), because this process produced a trap.
In the second terminal, the kernel reports a trap for the supercollider process with pid 4759, indicating that an invalid opcode was detected. Running the top command afterwards shows that process 4759 is gone, but all the other processes spawned by supercollider keep running, even after the test has finished. These processes have to be killed manually for the test to end.
**PuTTY connected using the network:**
SUPERCOLLIDER: Ways=1 Sets=6 DfltAddrAlgo=DFT Time=000:01:00
Thrd Type    Ways  Sets Tgt Of Attr Instr     Agent                   Loops
0    DWr-       1     6   4 38 WT   default   S0 C00 T0 pid= 4748  41975951
1    DRd8w      1     6   4 40 WT   NT        S0 C01 T0 pid= 4758  40082030
2    DWr        1     6   3 19 WC   vmovdqu   S0 C02 T0 pid= 4759         0
3    DWrL-      1     6   6  0 UC   stos      S0 C03 T0 pid= 4760   2221706
4    DRdL       1     6   0  0 WB   vmovdqa64 S0 C04 T0 pid= 4761 100743291
5    PrefNTA    1     6   3 18 WC   default   S0 C05 T0 pid= 4762 124776051
7    CRd        1     6   2 63 WP   default   S0 C07 T0 pid= 4763  54125452
9    Idle       1     6   4  0 WT   default   S0 C09 T0 pid= 4764 231532155
10   DWr-       1     6   2  7 WP   default   S0 C10 T0 pid= 4765  41178056
11   CBkRd      1 65536   5  0 WP   default   S0 C11 T0 pid= 4766       466
12   CfgRd      1     6   3  0 WC   default   S0 C12 T0 pid= 4767   1155933
15   DRd8w      1     6   1 16 UC   default   S0 C15 T0 pid= 4768  53571364
16   Monitor    1     6   2  0 WP   default   S0 C16 T0 pid= 4769  22569584
17   DWr-       1     6   1 25 UC   default   S0 C17 T0 pid= 4770  53211262
18   PrefT2     1     6   2  7 WP   default   S0 C18 T0 pid= 4771 245329213
19   Idle       1     6   0  0 WB   default   S0 C19 T0 pid= 4772 112134306
20   SMC        1     6   4 21 WT   default   S0 C20 T0 pid= 4773  12199810
23   DWr-       1     6   4 38 WT   default   S0 C23 T0 pid= 4774  39091355
Note: Updater=t19. TrigPort=F8. PPID=4748.
Completed 1 minutes, 0 seconds. Stopping test...
**PuTTY connected using serial port:**
root@sut:~>[609942.852393] traps: supercollider[4759] trap invalid opcode ip:555555592c35 sp:7fffffffd280 error:0 in supercollider[555555589000+61000]
root@sut:~>top
top - 19:00:40 up 7 days, 1:28, 3 users, load average: 16.01, 7.38, 2.85
Tasks: 260 total, 18 running, 241 sleeping, 0 stopped, 1 zombie
%Cpu(s): 70.8 us, 0.0 sy, 0.0 ni, 29.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 7770.9 total, 7170.6 free, 243.8 used, 356.6 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 7409.7 avail Mem
4748 root 20 0 65772 36732 5040 R 100.0 0.5 2:50.84 superco+
4760 root 20 0 65772 31788   96 R 100.0 0.4 2:50.61 superco+
4761 root 20 0 65772 34184 2492 R 100.0 0.4 2:50.61 superco+
4762 root 20 0 65772 34140 2448 R 100.0 0.4 2:50.61 superco+
4763 root 20 0 65772 34072 2380 R 100.0 0.4 2:50.61 superco+
4764 root 20 0 65772 34048 2356 R 100.0 0.4 2:50.61 superco+
4765 root 20 0 65772 34124 2432 R 100.0 0.4 2:50.61 superco+
4766 root 20 0 66284 34040 2348 R 100.0 0.4 2:50.61 superco+
4767 root 20 0 65772 31788   96 R 100.0 0.4 2:50.61 superco+
4768 root 20 0 65772 34132 2440 R 100.0 0.4 2:50.61 superco+
4770 root 20 0 65772 34028 2336 R 100.0 0.4 2:50.61 superco+
4771 root 20 0 65772 31788   96 R 100.0 0.4 2:50.61 superco+
4773 root 20 0 65772 34056 2364 R 100.0 0.4 2:50.61 superco+
4758 root 20 0 65772 34160 2468 R  99.7 0.4 2:50.60 superco+
4769 root 20 0 65776 34032 2340 R  99.7 0.4 2:50.60 superco+
4772 root 20 0 65772 34212 2520 R  99.7 0.4 2:50.60 superco+
4774 root 20 0 65772 34056 2364 R  99.7 0.4 2:50.60 superco+
root@sut:~>killall -SIGKILL supercollider
root@sut:~>
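The trap line in the serial output carries enough information to locate the faulting instruction relative to the start of the mapping it was executing from. The sketch below parses that line and subtracts the mapping base from the instruction pointer. The function name `decode_trap` and the regular expression are illustrative, written against the format of the single line captured in this report; other kernel versions may print it differently.

```python
import re

# Matches the "traps:" line format seen in the serial console output of this report.
TRAP_RE = re.compile(
    r"traps: (?P<comm>\S+)\[(?P<pid>\d+)\] trap (?P<kind>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>[0-9a-f]+) "
    r"in (?P<image>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def decode_trap(line):
    """Return pid, trap kind, image name and ip offset, or None if no match."""
    m = TRAP_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    return {
        "pid": int(m["pid"]),
        "kind": m["kind"],
        "image": m["image"],
        "offset": ip - base,  # offset of the faulting ip inside the mapping
        "inside_mapping": base <= ip < base + size,
    }

line = ("[609942.852393] traps: supercollider[4759] trap invalid opcode "
        "ip:555555592c35 sp:7fffffffd280 error:0 "
        "in supercollider[555555589000+61000]")
info = decode_trap(line)
print(info["pid"], info["kind"], hex(info["offset"]))  # 4759 invalid opcode 0x9c35
```

The offset is relative to the start of the mapping named in the message. Translating it to an address in the binary also requires the file offset of that mapping, which the trap line does not include.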
The VMOVDQU instruction is part of the AVX instruction set. The Tremont Atom core (used by SNR) supports AVX instructions with 128-bit FP and integer values. There are also versions of these instructions that operate on 256-bit values. In addition, AVX3.x (aka AVX-512) is not supported on Atom cores. Supercollider hardcodes the VMOVDQU instruction inside the DWr_vmovdqu algorithm. The hypothesis is that this encoding corresponds to an AVX variant that the Tremont Atom core does not support.
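One way to check this hypothesis on the system under test is to look at the SIMD feature flags the Linux kernel reports in `/proc/cpuinfo`. The VEX-encoded forms of VMOVDQU (128-bit and 256-bit) are gated by the `avx` flag, and the EVEX-encoded forms by the AVX-512 flags such as `avx512f`. The sketch below is a minimal illustration; the helper names are hypothetical and the sample text is made up for the example, not captured from the SNR system in this report.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()

def simd_support(flags):
    """Report which of the SIMD levels relevant to this issue are present."""
    return {name: name in flags for name in ("sse4_2", "avx", "avx2", "avx512f")}

def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description (Linux only)."""
    with open(path) as f:
        return f.read()

# Made-up sample for illustration only; NOT taken from the system in this report.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 aes\n"
print(simd_support(parse_cpu_flags(sample)))
```

On the system under test, `simd_support(parse_cpu_flags(read_cpuinfo()))` would show whether `avx` is reported at all, which narrows down which encoding of the instruction can be expected to trap.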
If the DWr_vmovdqu thread type is removed from the user JSON file, the issue does not occur. This is the approach Pysces follows: its configuration file for SNR does not mention the DWr_vmovdqu thread type, so the supercollider commands that Pysces generates on this platform do not contain this algorithm.
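The same filtering could be done automatically instead of by hand. The sketch below drops thread types whose required CPU flag is missing before the supercollider command is built. Everything specific in it is an assumption for illustration: the JSON layout (a `threads` list of objects with a `type` key), the `REQUIRED_FLAG` table, and the mapping of DWr_vmovdqu to the `avx` flag. The real user JSON schema is not shown in this issue.

```python
import json

# HYPOTHETICAL table: CPU flag that each thread type's algorithm is assumed to need.
REQUIRED_FLAG = {"DWr_vmovdqu": "avx"}

def filter_thread_types(config, cpu_flags):
    """Return (config without unsupported thread types, list of dropped type names)."""
    kept, dropped = [], []
    for entry in config["threads"]:
        needed = REQUIRED_FLAG.get(entry["type"])
        if needed is not None and needed not in cpu_flags:
            dropped.append(entry["type"])
        else:
            kept.append(entry)
    return {**config, "threads": kept}, dropped

# HYPOTHETICAL user JSON layout; the real schema may differ.
user_json = '{"threads": [{"type": "DWr_vmovdqu"}, {"type": "DRd8w"}, {"type": "Idle"}]}'
config = json.loads(user_json)

# Example flag set without "avx", chosen to exercise the filter.
filtered, dropped = filter_thread_types(config, cpu_flags={"sse4_2", "aes"})
print(json.dumps(filtered))
print("dropped:", dropped)
```

Keeping the requirement table next to the thread type definitions would avoid maintaining a separate configuration file per platform, at the cost of having to keep that table accurate.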
The goal of this issue is to start a discussion about the instruction set differences between Xeon and Atom, and where these differences need to be taken into account. One option is to have supercollider handle unsupported instructions properly. The other option is to remove from the user JSON file the algorithms that are not supported on a particular processor.
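For the first option, one possible shape is a parent that notices when a worker is killed by a signal such as SIGILL and then stops the remaining workers, so that nothing has to be cleaned up with killall. The sketch below shows that pattern in Python on a POSIX system. It is not supercollider's code: the `supervise` and `spawn` helpers are hypothetical, and the crashing worker simulates the invalid opcode by sending SIGILL to itself.

```python
import signal
import subprocess
import sys
import time

def describe_exit(returncode):
    """Return the signal name if the child was killed by a signal, else None."""
    if returncode is not None and returncode < 0:
        return signal.Signals(-returncode).name
    return None

def supervise(workers, poll_interval=0.05):
    """Poll the workers; if one dies from a signal, kill the rest.

    Returns (index, signal name) of the crashed worker, or None if all
    workers exit without being signaled.
    """
    while True:
        any_running = False
        for index, proc in enumerate(workers):
            rc = proc.poll()
            if rc is None:
                any_running = True
                continue
            sig = describe_exit(rc)
            if sig is not None:
                for other in workers:
                    if other.poll() is None:
                        other.kill()
                for other in workers:
                    other.wait()
                return index, sig
        if not any_running:
            return None
        time.sleep(poll_interval)

def spawn(code):
    """Start a Python child process running the given snippet."""
    return subprocess.Popen([sys.executable, "-c", code])

workers = [
    spawn("import time; time.sleep(30)"),
    # Simulates the invalid-opcode trap by raising SIGILL in the child.
    spawn("import os, signal; os.kill(os.getpid(), signal.SIGILL)"),
    spawn("import time; time.sleep(30)"),
]
result = supervise(workers)
print(result)
```

A real implementation could go further and report which thread type crashed, or install a SIGILL handler in the worker to skip the unsupported algorithm, but either choice depends on how supercollider manages its processes.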