vyperlang / vyper

Pythonic Smart Contract Language for the EVM
https://vyperlang.org

Unable to reproduce bytecode between contract generated from docker and binary #3369

Closed Enigmatic331 closed 1 year ago

Enigmatic331 commented 1 year ago

File in question: https://gist.github.com/Enigmatic331/5d85f73ec4b8e85ace1a2fa36c86790f

It seems that we are having issues reproducing the same bytecode for this contract when compiling with the Vyper binary versus the docker image...

Compiled with the vyper 0.3.7 binary, vyper037.exe CurveTricryptoOptimizedWETH.vy -f bytecode > output_037.txt, the result is: https://gist.github.com/Enigmatic331/c58aeafd6a3a36f9791b22b4098aff43

Compiled with the vyper 0.3.7 docker image, docker run vyperlang/vyper:0.3.7 /code/CurveTricryptoOptimizedWETH.vy -f "bytecode" > docker_output_037.txt, the result is: https://gist.github.com/Enigmatic331/085372a589a8509483b67b2977d6909d

Running the diffs: https://difff.jp/en/9bis6.html

Not sure if I might be missing anything - in both cases I am compiling the contract without any additional settings except to output "bytecode"..... Any idea what I might be missing? 🙏
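For anyone comparing two such outputs locally rather than through a web diff, here is a minimal Python sketch that reports the byte offset where two hex bytecode strings first diverge. The function name first_difference and the sample strings are made up for illustration; it only assumes the outputs are hex strings, with or without a leading 0x.

```python
def _to_bytes(text):
    """Parse a hex bytecode string, tolerating whitespace and an optional 0x prefix."""
    text = text.strip()
    if text.startswith("0x"):
        text = text[2:]
    return bytes.fromhex(text)


def first_difference(hex_a, hex_b):
    """Return the byte offset of the first mismatch, or None if the inputs are identical."""
    a = _to_bytes(hex_a)
    b = _to_bytes(hex_b)
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # No mismatch in the common prefix: the inputs differ only if their lengths differ.
    return None if len(a) == len(b) else min(len(a), len(b))


# Made-up sample strings, not real compiler output.
print(first_difference("0x6001600255", "0x6001600355"))
```

To use it on the files from this report, read output_037.txt and docker_output_037.txt and pass their contents to first_difference.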

Enigmatic331 commented 1 year ago

There is also an issue where we intermittently end up generating different bytecode with minor differences (e.g. the three attached outputs, all compiled using vyper037.exe CurveTricryptoOptimizedWETH.vy -f bytecode > filename) - Though I reckon this is secondary and resolving the diffs between docker <> binary would also sort this out....

output_117PM.txt output_118PM.txt output_121PM.txt

pcaversaccio commented 1 year ago

I replicated the following behavior:

yolo:~$ docker pull vyperlang/vyper:0.3.7
yolo:~$ docker run -v ${PWD}:/code vyperlang/vyper:0.3.7 /code/WETH.vy -f "bytecode" > docker_output_037.txt

docker_output_037.txt

Left is binary and right is docker output: https://difff.jp/en/33ari.html

I don't have the same diffs as Etherscan (very weird) but there are some diffs. I used the same contract but just renamed it to WETH.vy.

charles-cooper commented 1 year ago

been digging into this -- apparently, the topsort order for function codegen is not stable. this means that sections of bytecode might be generated in different orders across runs (although observed behavior of the contract as a whole will be the same). the underlying reason for this seems to be, while dict objects guarantee insertion order since python 3.7, set makes no such guarantee, and sets are used in computing the call graph as of 4b44ee7bcc3d9dba74329cf35436a267e4dafa87: https://github.com/vyperlang/vyper/blob/4b44ee7bcc3d9dba74329cf35436a267e4dafa87/vyper/semantics/types/function.py#L126
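A small sketch of the general effect, not of the compiler itself: it starts fresh interpreters with different hash seeds and prints the order in which a set of strings is iterated. The function names in the set are made up, and order_with_seed is a hypothetical helper. Whether any two seeds actually produce different orders depends on the Python build, so no particular output is claimed.

```python
import os
import subprocess
import sys

# Child program: iterate a set of made-up function names and print the order seen.
CHILD = "print(','.join({'transfer', 'approve', 'mint', 'burn', 'swap'}))"


def order_with_seed(seed):
    """Run CHILD in a new interpreter with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# The set contents are identical in every run; only the iteration order may change.
orders = {seed: order_with_seed(seed) for seed in range(1, 6)}
for seed, order in orders.items():
    print(seed, order)
```

If anything downstream walks such a set to decide what to emit next, each distinct order can lead to a different layout of otherwise equivalent output.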

it might be possible to force the iteration order to be stable by fixing PYTHONHASHSEED, but the underlying fix will be to replace any uses of set with OrderedSet.
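To illustrate the OrderedSet idea, here is a minimal sketch of an insertion-ordered set backed by a dict, used in a toy depth-first ordering of a call graph. This is an assumption-laden illustration, not Vyper's actual OrderedSet or codegen: the class body, codegen_order, and the function names in call_graph are all hypothetical.

```python
class OrderedSet:
    """Minimal insertion-ordered set backed by a dict (illustrative sketch only)."""

    def __init__(self, items=()):
        # dict preserves insertion order, so its keys act as an ordered set.
        self._items = dict.fromkeys(items)

    def add(self, item):
        self._items[item] = None

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


def codegen_order(call_graph, roots):
    """Depth-first post-order: every callee is listed before its callers."""
    seen = OrderedSet()
    order = []

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for callee in call_graph.get(name, ()):
            visit(callee)
        order.append(name)

    for root in roots:
        visit(root)
    return order


# Hypothetical call graph: callees are stored in the order they were recorded.
call_graph = {
    "swap": OrderedSet(["_fee", "_transfer"]),
    "mint": OrderedSet(["_transfer"]),
    "_fee": OrderedSet(),
    "_transfer": OrderedSet(),
}
print(codegen_order(call_graph, ["swap", "mint"]))
```

Because callees are visited in insertion order rather than hash order, the result depends only on how the graph was built, not on the hash seed or on memory addresses.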

Enigmatic331 commented 1 year ago

Thank youuuu Charrleeesss aaaaaaaaa ❤️

Yea I would reckon setting PYTHONHASHSEED helps - Though a quick run yesterday with it set in our environment variables didn't seem to generate stable bytecode across two machines (despite print(os.environ["PYTHONHASHSEED"]) showing 0 at Python runtime) - Not too sure what's the deal with that but we will be digging around this a bit more.
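One thing that may be worth checking here: PYTHONHASHSEED controls the hashes of str and bytes objects, while objects that keep Python's default hash are hashed by identity, which the seed does not touch. The sketch below compares both cases across fresh interpreter runs. Node is a made-up stand-in class and child_output a hypothetical helper; whether this is what happened on the two machines above is an assumption, not something the thread confirms.

```python
import os
import subprocess
import sys


def child_output(code, seed):
    """Run `code` in a new interpreter with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# str hashes are governed by the seed, so two runs with seed 0 should agree.
STR_HASH = "print(hash('CurveTricryptoOptimizedWETH'))"
first = child_output(STR_HASH, 0)
second = child_output(STR_HASH, 0)
print("str hash equal across runs:", first == second)

# The default object hash is identity-based, so the seed does not pin it down.
# Whether the two values match depends on the platform's memory layout.
OBJ_HASH = "class Node: pass\nprint(hash(Node()))"
obj_first = child_output(OBJ_HASH, 0)
obj_second = child_output(OBJ_HASH, 0)
print("object hash equal across runs:", obj_first == obj_second)
```

If the second line prints False on a given machine, a set holding such objects can still iterate in a different order from run to run even with the seed fixed.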

c-protocol-team commented 1 year ago

Thank you very much indeed, Charles. Please note that until a new version of the compiler is released, source verification on Etherscan of some Vyper contracts will continue to fail.