lightbulb128 / troy-nova

GPU/CUDA implementation of Leveled BFV/CKKS/BGV scheme.

Python Test Failures #10

Open Creling opened 2 months ago

Creling commented 2 months ago

Hi,

I encountered a problem where nearly all Python tests fail in my environment, including some basic operations.

image

I suspect the issue might be related to the decode function.

# test_he_operations.py

def test_add_sub(self):
    ghe = self.ghe
    message1 = ghe.random_simd_full()
    message2 = ghe.random_simd_full()
    plain1 = ghe.encoder.encode_simd(message1)
    plain2 = ghe.encoder.encode_simd(message2)
    cipher1 = ghe.encryptor.encrypt_symmetric_new(plain1, False)
    cipher2 = ghe.encryptor.encrypt_symmetric_new(plain2, False)
    added = ghe.evaluator.add_new(cipher1, cipher2)
    decoded = ghe.encoder.decode_simd(ghe.decryptor.decrypt_new(added))

    print(f"message1: {message1}")
    print(f"message2: {message2}")
    print(f"decoded: {decoded}")

'''
outputs:

message1: [ 105270  500069  262731  803957  543788  756142  244039 1036853   65449
  229446  656679   88176  384052  776630  392070  912476  543635  258704
  999613  962219  939314  428918  597307  417538  478815 1037033   92833
  119667  836193  217219  852762   98650]
message2: [ 199749  421437  145283  560030  414781  466479  338126  535338   96947
  902743 1028024  940739  261747  622940  455635  888047  416033  356917
  242463  823168  747736  339538  860854  887122  584590  205416  329272
  430375  342742   29770    5727  702475]
decoded: [305019 305019 305019 305019 305019 305019 305019 305019 305019 305019
 305019 305019 305019 305019 305019 305019 305019 305019 305019 305019
 305019 305019 305019 305019 305019 305019 305019 305019 305019 305019
 305019 305019]
'''

My environment details are as follows:

Python 3.11.9
CMake  3.30.2
Nvidia Driver 545.23.08    
CUDA 12.3
RTX 3090

P.S. I have added the CMake flags mentioned in https://github.com/lightbulb128/troy-nova/issues/1, and the C++ tests work well.

image

lightbulb128 commented 2 months ago

What are the 12 tests that passed in python's he_operations? If the problem is related to the GPU device, I would expect all the host tests to have passed.

And could you paste the error log for the "SerializeTest.DeviceCKKSCiphertext" unittest in googletest, which you showed in the image above?

Creling commented 2 months ago

Well, all decode-related tests fail, regardless of whether they run on the host or the device. The 12 tests that pass are:

test_extract_lwe (__main__.HostBFVHeOperations.test_extract_lwe) ... ok
test_setup_ok (__main__.HostBFVHeOperations.test_setup_ok) ... ok
test_extract_lwe (__main__.HostCKKSHeOperations.test_extract_lwe) ... ok
test_setup_ok (__main__.HostCKKSHeOperations.test_setup_ok) ... ok
test_extract_lwe (__main__.HostBGVHeOperations.test_extract_lwe) ... ok
test_setup_ok (__main__.HostBGVHeOperations.test_setup_ok) ... ok
test_extract_lwe (__main__.DeviceBFVHeOperations.test_extract_lwe) ... ok
test_setup_ok (__main__.DeviceBFVHeOperations.test_setup_ok) ... ok
test_extract_lwe (__main__.DeviceCKKSHeOperations.test_extract_lwe) ... ok
test_setup_ok (__main__.DeviceCKKSHeOperations.test_setup_ok) ... ok
test_extract_lwe (__main__.DeviceBGVHeOperations.test_extract_lwe) ... ok
test_setup_ok (__main__.DeviceBGVHeOperations.test_setup_ok) ... ok

Creling commented 2 months ago

"SerializeTest.DeviceCKKSCiphertext" fails in line 140:

https://github.com/lightbulb128/troy-nova/blob/c89a8980c2b266d9fe82f69b03f52b62abaecf5f/test/serialize.cu#L134-L141

Value of: truth.near_equal(decoded, tolerance)
  Actual: false
Expected: true

--- update ---

[ RUN      ] SerializeTest.DeviceCKKSCiphertext
truth: [(-22.0926,27.1982), (24.969,-97.0889), (22.6202,-88.7042), (37.0248,40.4353), (7.0871,15.3819), (25.9213,57.6711), (-5.56384,-123.845), (-55.8037,30.8642), (-83.234,28.3502), (64.2196,-63.6204), (29.8523,3.06689), (-29.6879,19.0774), (18.5548,0.291333), (2.41882,11.391), (-17.6762,66.9577), (-4.72304,4.5837)]
decoded: [(-22.0964,27.1992), (24.9733,-97.0927), (22.6308,-88.7061), (37.026,40.4334), (7.08572,15.3827), (25.9252,57.6718), (-5.56762,-123.846), (-55.8013,30.8707), (-83.2311,28.3503), (64.2216,-63.624), (29.8532,3.06774), (-29.6862,19.0821), (18.5568,0.294502), (2.41908,11.3896), (-17.6835,66.9577), (-4.72335,4.58403)]
/.../troy-nova/test/serialize.cu:142: Failure
Value of: truth.near_equal(decoded, tolerance)
  Actual: false
Expected: true

[  FAILED  ] SerializeTest.DeviceCKKSCiphertext (9 ms)

lightbulb128 commented 2 months ago

"SerializeTest.DeviceCKKSCiphertext" seems unrelated to the Python failures; for that test I just need to loosen the tolerance.
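To get a feel for how far the CKKS result drifts, the first three slots from the log above can be compared in NumPy. This is only a sketch: `near_equal` here is a hypothetical Python stand-in, not the C++ method used in test/serialize.cu, and the tolerance values are chosen purely for illustration.

```python
import numpy as np

# First three slots copied from the "truth" and "decoded" lines of the log.
truth = np.array([
    complex(-22.0926, 27.1982),
    complex(24.969, -97.0889),
    complex(22.6202, -88.7042),
])
decoded = np.array([
    complex(-22.0964, 27.1992),
    complex(24.9733, -97.0927),
    complex(22.6308, -88.7061),
])


def near_equal(a, b, tolerance):
    """Hypothetical check: every slot differs by at most `tolerance` in magnitude."""
    return bool(np.all(np.abs(a - b) <= tolerance))


# Largest per-slot error among the sampled slots.
max_err = float(np.max(np.abs(truth - decoded)))

print(f"max error over sampled slots: {max_err:.4f}")
print("passes with tolerance 1e-3:", near_equal(truth, decoded, 1e-3))
print("passes with tolerance 5e-2:", near_equal(truth, decoded, 5e-2))
```

The sampled error is on the order of 1e-2, so a check of this shape only passes once the tolerance is at least that large.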

I don't know exactly why the Python decoding is failing. From the output, every element of the decoded vector has the same value, and that value equals the sum of the first elements of the two input vectors (105270 + 199749 = 305019). I should probably check this in exactly the same environment as yours.
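The symptom can be checked mechanically. The sketch below uses the first four values of each vector from the log above; the helper name `is_broadcast_of_first_sum` is made up for illustration and is not part of troy-nova. It ignores reduction modulo the plaintext modulus, which does not matter here because 305019 is smaller than individual message values in the log.

```python
import numpy as np


def is_broadcast_of_first_sum(message1, message2, decoded):
    """Return True if every decoded slot equals message1[0] + message2[0]."""
    first_sum = int(message1[0]) + int(message2[0])
    return bool(np.all(np.asarray(decoded) == first_sum))


# First four slots copied from the log (the full vectors have 32 slots).
message1 = [105270, 500069, 262731, 803957]
message2 = [199749, 421437, 145283, 560030]
decoded = [305019, 305019, 305019, 305019]

# What a correct slot-wise addition would give for these four slots.
expected = np.asarray(message1) + np.asarray(message2)

print("decoded is a broadcast of the first sum:",
      is_broadcast_of_first_sum(message1, message2, decoded))
print("decoded matches slot-wise sum:",
      bool(np.array_equal(expected, decoded)))
```

Only slot 0 agrees with the slot-wise sum, which suggests the input vectors are being reduced to their first element somewhere on the way into or out of the encoder, rather than the addition itself being wrong.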

Creling commented 2 months ago

Yes, "SerializeTest.DeviceCKKSCiphertext" is unrelated to this issue.

For the decoding problem, what is your environment?

lightbulb128 commented 2 months ago

My environment:

  • NVIDIA GeForce RTX 4090
  • CUDA 12.4, drivers 550.54.14
  • g++/gcc 11.4
  • Ubuntu 20.04
  • Python 3.8.10
  • cmake 3.28.1

Creling commented 2 months ago

I tested with Python 3.8 and it works; all tests pass.

Creling commented 2 months ago

Thank you for your help.

lightbulb128 commented 4 weeks ago

Reproduced this issue with Python 3.10.15. Fix: the pybind11 version seems to be too old. Updating pybind11 from v2.11 to v2.13 resolved the issue in my reproduction (#14).
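For anyone checking their own setup, here is a rough sketch of a version check. It assumes pybind11 is installed as a Python package in the active environment; if the build pulls pybind11 in some other way (for example as a vendored copy), this check does not apply and the version has to be read from that copy instead. The helper names are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_major_minor(version_string):
    """Turn a version string such as '2.13.6' into the tuple (2, 13)."""
    parts = version_string.split(".")
    return int(parts[0]), int(parts[1])


def pybind11_is_at_least(required=(2, 13)):
    """Return True/False for an installed pybind11 package, or None if absent."""
    try:
        installed = version("pybind11")
    except PackageNotFoundError:
        return None
    return parse_major_minor(installed) >= required


result = pybind11_is_at_least()
if result is None:
    print("pybind11 is not installed as a Python package in this environment")
elif result:
    print("pybind11 is v2.13 or newer")
else:
    print("pybind11 is older than v2.13; consider updating")
```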