Terraspace / UASM

UASM - Macro Assembler
http://www.terraspace.co.uk/uasm.html

Improper AVX512 code generation #35

Closed gwoltman closed 7 years ago

gwoltman commented 8 years ago

This code:

    vpand zmm2, zmm2, zmm0
    vpmovmskb rcx, zmm2
    sub rdx, rdx

Produces this in objconv disassembly:

?_236:; vpandd zmm2, zmm2, zmm0                         ; 4CB3 _ 62 F1 6D 48: DB. D0
        db 62H, 0F1H, 6DH, 48H, 0DBH, 0D0H
; Error: This instruction is not allowed in 64 bit mode
; Error: VEX.mmmm bits out of range
; Note: Prefix bit or byte has no meaning in this context
; Note: VEX prefix bits not allowed here
; Warning: MVEX prefix not allowed for this instruction
; Warning: MVEX prefix option bits not allowed here
;       dec     r8d                                     ; 4CB9 _ 62 C4 F1 FD: 48
        db 62H, 0C4H, 0F1H, 0FDH, 48H
;       xlatb                                           ; 4CBE _ D7
        db 0D7H
;       retf    11080                                   ; 4CBF _ CA, 2B48
        db 0CAH, 48H, 2BH
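For anyone who wants to check the emitted bytes by hand, here is a minimal Python sketch that splits an EVEX prefix into its fields. It assumes the original AVX-512 prefix layout (later ISA extensions reuse the bits treated as reserved here), it reports only the low three bits of the ModRM register fields, and `decode_evex` is a name made up for this illustration; it is not part of objconv or HJWasm.

```python
def decode_evex(code: bytes) -> dict:
    """Split 62 P0 P1 P2 opcode modrm into named fields.

    Assumes the original AVX-512 EVEX layout; the R/X/B/R' extension
    bits are ignored, so register fields show only their low 3 bits.
    """
    if len(code) < 6 or code[0] != 0x62:
        raise ValueError("expected 0x62 followed by at least 5 more bytes")
    p0, p1, p2, opcode, modrm = code[1:6]

    # vvvv (P1 bits 6:3) and V' (P2 bit 3) are stored inverted and
    # together form the 5-bit index of the first source register.
    vvvv = ((p1 >> 3) & 0xF) ^ 0xF
    v_hi = ((p2 >> 3) & 1) ^ 1

    return {
        # In this layout P0 bits 3:2 must be zero and bits 1:0 pick the map.
        "reserved_ok": (p0 >> 2) & 0b11 == 0,
        "opcode_map": {1: "0F", 2: "0F38", 3: "0F3A"}.get(p0 & 0b11, "invalid"),
        "W": p1 >> 7,
        "vvvv_reg": (v_hi << 4) | vvvv,
        "prefix": {0: None, 1: "66", 2: "F3", 3: "F2"}[p1 & 0b11],
        "vector_bits": {0: 128, 1: 256, 2: 512}.get((p2 >> 5) & 0b11),
        "opcode": opcode,
        "modrm_mod": modrm >> 6,
        "modrm_reg": (modrm >> 3) & 7,
        "modrm_rm": modrm & 7,
    }


first = decode_evex(bytes.fromhex("62F16D48DBD0"))
second = decode_evex(bytes.fromhex("62C4F1FD48D7"))
```

Under those assumptions, `first` comes out as map 0F, prefix 66, W=0, 512-bit, opcode DB, with vvvv=2, reg=2 and rm=0, which is consistent with the `vpandd zmm2, zmm2, zmm0` that objconv prints. `second` fails the reserved-bits check, which lines up with the "VEX.mmmm bits out of range" error.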

john-terraspace commented 8 years ago

Hi,

Technically, there are no such instructions:

vpand cannot take zmm registers, and neither can vpmovmskb; both can only be used with registers up to YMM (AVX2).

The AVX512 variants are different instructions, such as vpandd.

However, it’s a good spot: we really shouldn’t let them assemble in the first place!
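As a rough illustration of the kind of check being described, here is a minimal Python sketch that rejects zmm operands on mnemonics that only exist in VEX (up to 256-bit) form. The table, the `check_operands` name and the error text are all hypothetical; this is not how HJWasm implements the fix, and the table covers only the mnemonics mentioned in this thread.

```python
from typing import List, Optional

# Hypothetical table: widest vector operand each VEX-only mnemonic accepts,
# plus the AVX-512 instruction(s) that cover the 512-bit case.
VEX_ONLY = {
    "vpand": (256, "vpandd / vpandq"),
    "vpxor": (256, "vpxord / vpxorq"),
    # The two below are not drop-in replacements: they write a mask register.
    "vptest": (256, "vptestmd / vptestmq"),
    "vpmovmskb": (256, "vpmovb2m"),
}

REG_BITS = {"xmm": 128, "ymm": 256, "zmm": 512}


def check_operands(mnemonic: str, operands: List[str]) -> Optional[str]:
    """Return an error message if an operand is too wide, else None."""
    entry = VEX_ONLY.get(mnemonic.lower())
    if entry is None:
        return None  # not in the table; this sketch has no opinion
    max_bits, alternative = entry
    for op in operands:
        # The first three letters identify xmm/ymm/zmm operands.
        kind = op.strip().lower()[:3]
        if REG_BITS.get(kind, 0) > max_bits:
            return f"{mnemonic} does not accept {op.strip()}; see {alternative}"
    return None


print(check_operands("vpand", ["zmm2", "zmm2", "zmm0"]))
print(check_operands("vpand", ["ymm2", "ymm2", "ymm0"]))
```

The first call returns a message pointing at vpandd / vpandq, and the second returns None because the ymm form is valid.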

john-terraspace commented 8 years ago

I’ve updated the packages with a fix that prevents these from assembling.

Regards,

John

gwoltman commented 8 years ago

So much for my lazy plan to upgrade my AVX2 code to AVX512 by searching for YMM and replacing it with ZMM.

Anyway, more for you:

VPTEST should also reject zmm registers.

And the legal instruction: vpcmpeqq zmm2, zmm2, zmm1 generates this in objconv:

; Error: This instruction is not allowed in 64 bit mode
; Error: VEX.mmmm bits out of range
; Note: Prefix bit or byte has no meaning in this context
; Note: VEX prefix bits not allowed here
; Warning: MVEX prefix not allowed for this instruction
; Warning: MVEX prefix option bits not allowed here
;       dec     r8d                                     ; 4F5D _ 62 C4 F2 6D: 48
        db 62H, 0C4H, 0F2H, 6DH, 48H
;       sub     ecx, edx                                ; 4F62 _ 29. D1
        db 29H, 0D1H

gwoltman commented 8 years ago

Correction: the vpcmpeqq instruction above is not legal.

john-terraspace commented 8 years ago

Lol yeah I don’t think you’ll get away that easily! :)

Will sort vptest and vpcmpeqq out asap.

gwoltman commented 8 years ago

The legal "vpcmpeqq k3, zmm2, zmm1" is getting the objconv error messages.

john-terraspace commented 8 years ago

Agreed. Also,

VPCMPEQQ zmm1, zmm2, zmm3

shouldn’t assemble, as it’s not valid.
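The distinction between the two forms can be sketched in a few lines of Python: with register-only operands, vpcmpeqq either writes a vector register (VEX encoding, xmm/ymm only) or a mask register k0 to k7 (EVEX encoding). The `vpcmpeqq_form` helper is hypothetical and only classifies the operand pattern; memory operands, write masks and CPU feature checks such as AVX512VL are left out.

```python
import re

_REG = re.compile(r"^(k|xmm|ymm|zmm)(\d+)$")


def vpcmpeqq_form(dest: str, src1: str, src2: str) -> str:
    """Classify a register-only vpcmpeqq as 'VEX', 'EVEX' or 'invalid'."""
    matches = [_REG.match(r.strip().lower()) for r in (dest, src1, src2)]
    if not all(matches):
        return "invalid"  # memory operands and typos are out of scope here
    (d_cls, d_num), (a_cls, a_num), (b_cls, b_num) = [
        (m.group(1), int(m.group(2))) for m in matches
    ]
    # Both sources must be vector registers of the same width.
    if a_cls != b_cls or a_cls == "k" or max(a_num, b_num) > 31:
        return "invalid"
    if d_cls == "k":
        # EVEX form: the comparison result goes to a mask register.
        return "EVEX" if d_num <= 7 else "invalid"
    # VEX form: vector destination, xmm/ymm only, registers 0 to 15.
    if d_cls == a_cls and d_cls in ("xmm", "ymm") and max(d_num, a_num, b_num) <= 15:
        return "VEX"
    return "invalid"


print(vpcmpeqq_form("k3", "zmm2", "zmm1"))
print(vpcmpeqq_form("zmm1", "zmm2", "zmm3"))
```

The first call classifies the legal form from this thread as EVEX; the second reports the zmm-destination form as invalid.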

gwoltman commented 8 years ago

vpxor must also reject zmm registers (I guess you'll need to check all the bitwise AVX2 operators).

john-terraspace commented 8 years ago

I’ve checked a few already, and some others seem fine, like por / vpor, but I’ll go through and check the rest too. vpxor is fixed now as well; we’re just waiting for Habran’s update for the cmpeqq stuff and then we should have an update ready. PS: it will be version 2.15 r3, as I don’t want to keep relying on just the date of the package to know if it’s the latest.

As soon as everything is 100% stable, I think we’ll fix the version at 2.16 before continuing with any future updates / features.

gwoltman commented 7 years ago

A question and a problem:

1) Any idea when the vpcmpeqq k3, zmm2, zmm1 problem will get fixed and r3 released?

2) Somehow we've introduced a crash bug in the last iteration or two. I just tried re-assembling all my code (not the new AVX512 stuff) and HJWASM r2 dated Oct 1 is crashing. Sorry, my attempts to narrow down the problem have been unsuccessful. I'm hoping you can review your most recent changes and something will stand out. If not, I can put together a zip file containing all you need to replicate the problem.

john-terraspace commented 7 years ago

Hi,

That would be very helpful; hopefully the issue is related to something we’re already fixing!

We’ve been a bit quiet as we are completely refactoring the code that handles AVX2 and AVX512.

We found many other issues apart from the ones you’d discovered. To test, we set up Intel SDE and have used the Gas/Nasm test case files.

In doing so, we have also discovered AVX512 bugs in both of those assemblers, which we are correcting in HJWasm. We should have 2.16 out soon; it will include the fully refactored AVX2 and AVX512 handling, which not only fixes the bugs but also prevents misuse of register types and memory reference sizes in cases that simply shouldn’t assemble in the first place.

As soon as it’s ready and we’ve re-run it through all the AVX test cases (including the samples you have provided), we will update.

Regards,

John

gwoltman commented 7 years ago

Hi,

Sounds like good things are happening.

Attached is (one of) the AVX asm routines that is causing an HJWASM crash. Should be mostly obvious. I run c64.bat, which sets PATH and calls "nmake compil64". Compil64 has the HJWASM command line.

Grrrr. Google won't attach the zip file. Here is the dropbox link: https://www.dropbox.com/s/ycnryd8ukojn5vi/xxx.zip?dl=0

Good luck, George

john-terraspace commented 7 years ago

All changes are implemented. We have reviewed an immense list of AVX2 and AVX512 instructions, and all seem good now. All missing instructions have been added. I have re-tested your examples above with objconv and there are no longer any errors.

V2.16 is now out :)

With regard to your crash, I have been able to build successfully using 2.16. Here is the output:

D:\xxx>c64

D:\xxx>setlocal

D:\xxx>if exist "c:\program files (x86)" goto x64

D:\xxx>REM if exist "c:\program files (x86)\microsoft visual studio 10.0" goto x64_10

D:\xxx>if exist "c:\program files\microsoft visual studio 8" goto x64_8

D:\xxx>call "c:\program files\microsoft visual studio 9.0\vc\bin\amd64\vcvarsamd64"
The system cannot find the path specified.

D:\xxx>goto makeit

D:\xxx>nmake /f compil64

Microsoft (R) Program Maintenance Utility Version 11.00.61030.0
Copyright (C) Microsoft Corporation. All rights reserved.

    HJWasm32 /c /DX86_64 /DLINUX64 /DARCH=FMA3 -win64 /Folinux64\factor64.obj factor64.asm
HJWasm v2.16, Nov 9 2016, Masm-compatible assembler.
Portions Copyright (c) 1992-2002 Sybase, Inc. All Rights Reserved.
Source code is available under the Sybase Open Watcom Public License.

factor64.asm: 4418 lines, 2 passes, 21 ms, 0 warnings, 0 errors
    d:\objconv\objconv -felf64 linux64\factor64.obj linux64\factor64.o

Input file: linux64\factor64.obj, output file: linux64\factor64.o
Converting from COFF64 to ELF64

0 Debug sections removed
0 Exception sections removed
    attrib -r linux64\factor64.o
    d:\objconv\objconv -fmacho64 -nu+ linux64\factor64.obj macosx64\factor64.o

Input file: linux64\factor64.obj, output file: macosx64\factor64.o
Converting from COFF64 to Mach-O Little Endian64
Adding leading underscores to symbol names

1 Debug sections removed
0 Exception sections removed
4 Changes in leading underscores on symbol names
    attrib -r macosx64\factor64.o
    del linux64\factor64.obj

D:\xxx>endlocal

D:\xxx>

john-terraspace commented 7 years ago

On another note, how are you getting on with the generated OSX objects? Have they worked out? I've still not had any time to look at getting the generated output working on OSX, so I'm interested to see how you have fared, as it would make for a very valuable example to include in the hjwasm package.

gwoltman commented 7 years ago

Hi,

I've tried the new version and am having difficulties. I'll close this issue and open a new one. Thanks for the hard work on addressing these concerns!

George