google / heir

A compiler for homomorphic encryption
https://heir.dev/
Apache License 2.0
320 stars 47 forks source link

`//tests/cggi:straight_line_vectorizer.mlir.test` flakes #519

Open j2kun opened 7 months ago

j2kun commented 7 months ago

It appears that //tests/cggi:straight_line_vectorizer.mlir.test flakily fails, I believe due to arbitrariness in the graph algorithm topo-sorting.

github-actions[bot] commented 7 months ago

This issue has 2 outstanding TODOs:

This comment was autogenerated by todo-backlinks

ai-mannamalai commented 2 months ago

note sure if this is still a bug:

while [ 1 -gt 0 ]; 
 do
 heir-opt --straight-line-vectorize tests/cggi/straight_line_vectorizer.mlir | FileCheck tests/cggi/straight_line_vectorizer.mlir 
 echo $?
 sleep 1
done

this loop continues w/o error.

However, the following shows an error from the canonicalize work I had introduced,

heir-opt --canonicalize --straight-line-vectorize tests/cggi/straight_line_vectorizer.mlir 
tests/cggi/straight_line_vectorizer.mlir:41:9: error: 'cggi.lut_lincomb' op operand #0 must be variadic of A type for LWE ciphertexts, but got 'tensor<6x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>'
  %11 = cggi.lut3 %10, %extracted_5, %3 {lookup_table = 105 : ui8} : !ct_ty
        ^
tests/cggi/straight_line_vectorizer.mlir:41:9: note: see current operation: %34 = "cggi.lut_lincomb"(%31, %32, %33) <{coefficients = array<i32: 1, 2, 4>, lookup_table = 105 : ui8}> : (tensor<6x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>, tensor<6x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>, tensor<6x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>) -> tensor<6x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>
tests/cggi/straight_line_vectorizer.mlir:73:9: error: 'cggi.lut_lincomb' op operand #0 must be variadic of A type for LWE ciphertexts, but got 'tensor<6x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>'
  %r1 = cggi.lut3 %0, %1, %2 {lookup_table = 8 : ui8} : !ct_ty
        ^
tests/cggi/straight_line_vectorizer.mlir:73:9: note: see current operation: %21 = "cggi.lut_lincomb"(%18, %19, %20) <{coefficients = array<i32: 1, 2, 4>, lookup_table = 8 : ui8}> : (tensor<6x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>, tensor<6x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>, tensor<6x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>) -> tensor<6x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>
tests/cggi/straight_line_vectorizer.mlir:120:9: error: 'cggi.lut_lincomb' op operand #0 must be variadic of A type for LWE ciphertexts, but got 'tensor<4x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>'
  %r1 = cggi.lut3 %0, %1, %2 {lookup_table = 8 : ui8} : !ct_ty
        ^
tests/cggi/straight_line_vectorizer.mlir:120:9: note: see current operation: %19 = "cggi.lut_lincomb"(%16, %17, %18) <{coefficients = array<i32: 1, 2, 4>, lookup_table = 8 : ui8}> : (tensor<4x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>, tensor<4x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>, tensor<4x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>) -> tensor<4x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>

perhaps a separate issue or reopen #877 ? Seems like I need to introduce the tensor extract at index like,

 func.func @require_post_pass_toposort_lut2(%arg0: tensor<8x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>) -> !lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>> {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %extracted = tensor.extract %arg0[%c0] : tensor<8x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>
    %extracted_0 = tensor.extract %arg0[%c1] : tensor<8x!lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>>
    %0 = cggi.lut_lincomb %extracted, %extracted_0 {coefficients = array<i32: 1, 2>, lookup_table = 8 : ui8} : <encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>
    return %0 : !lwe.lwe_ciphertext<encoding = #lwe.unspecified_bit_field_encoding<cleartext_bitwidth = 3>>
  }
asraa commented 2 months ago

Woah, nice find! I actually think that was because lut_lincomb's input specification wants Variadic<LweCiphertext> and doesn't allow tensors of LWE ciphertexts. I have a fix PR here allowing container types: https://github.com/google/heir/pull/939

ai-mannamalai commented 2 months ago

Thanks for the fix!

From: asraa @.> Date: Monday, August 26, 2024 at 6:16 AM To: google/heir @.> Cc: Muthu Annamalai @.>, Comment @.> Subject: [EXTERNAL] Re: [google/heir] //tests/cggi:straight_line_vectorizer.mlir.test flakes (Issue #519)

Woah, nice find! I actually think that was because lut_lincomb's input specification wants Variadic and doesn't allow tensors of LWE ciphertexts. I have a fix PR here allowing container types: #939https://github.com/google/heir/pull/939

— Reply to this email directly, view it on GitHubhttps://github.com/google/heir/issues/519#issuecomment-2310193658, or unsubscribehttps://github.com/notifications/unsubscribe-auth/APGLTDGEG2JHLFTA632RBG3ZTMTD7AVCNFSM6AAAAABERHH67KVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMJQGE4TGNRVHA. You are receiving this because you commented.Message ID: @.***>

ai-mannamalai commented 1 month ago

Current reproduction steps fail and I suspect the bug is not manifest anymore; should you consider to close it ?

asraa commented 1 month ago

I think we might want to keep it. It's true that I made a workaround so the test doesn't flake anymore (https://github.com/google/heir/blob/main/tests/cggi/straight_line_vectorizer.mlir#L93-L97), but the test isn't as robust. The straight line vectorizer is still nondeterministic.

So actually the issue really is if we want to keep straight-line-vectorizer as non-deterministic or not. The test is fixed so that it allows for that to happen. @j2kun is this issue OK to close (i.e. vectorizer is non-determinisitc is fine)?

ai-mannamalai commented 1 month ago

Okay we can take a look at if some inherent STL unordered containers are leaking this non determinism as a easy source of bug and apply a fix

Get Outlook for iOShttps://aka.ms/o0ukef


From: asraa @.> Sent: Thursday, August 29, 2024 7:49:34 AM To: google/heir @.> Cc: Muthu Annamalai @.>; Comment @.> Subject: [EXTERNAL] Re: [google/heir] //tests/cggi:straight_line_vectorizer.mlir.test flakes (Issue #519)

I think we might want to keep it. It's true that I made a workaround so the test doesn't flake anymore (https://github.com/google/heir/blob/main/tests/cggi/straight_line_vectorizer.mlir#L93-L97https://github.com/google/heir/blob/main/tests/cggi/straight_line_vectorizer.mlir#L93-L97), but the test isn't as robust. The straight line vectorizer is still nondeterministic.

So actually the issue really is if we want to keep straight-line-vectorizer as non-deterministic or not. The test is fixed so that it allows for that to happen. @j2kunhttps://github.com/j2kun is this issue OK to close (i.e. vectorizer is non-determinisitc is fine)?

— Reply to this email directly, view it on GitHubhttps://github.com/google/heir/issues/519#issuecomment-2317951220, or unsubscribehttps://github.com/notifications/unsubscribe-auth/APGLTDCRSZ5LTN2JKU3UEYDZT4YH5AVCNFSM6AAAAABERHH67KVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMJXHE2TCMRSGA. You are receiving this because you commented.Message ID: @.***>

asraa commented 1 month ago

Okay we can take a look at if some inherent STL unordered containers are leaking this non determinism as a easy source of bug and apply a fix

Definitely only if it's easy! I don't think it's super high priority for this pass to be deterministic