Running on contracts with external dependencies

njelich commented 1 year ago

Currently smartbugs works fine on contracts with clear direct dependencies, but what about contracts importing from "@openzeppelin" "@chainlink" etc as part of node_modules and similar.

Should these cases be handled directly within honeypots or should a resolver for this work externally?

jff commented 1 year ago

SmartBugs delegates the analysis to the user-specified analysis tools. I believe that it's up to each tool to decide whether external dependencies are considered.

Are there any analysis tools that support this?

njelich commented 1 year ago

For example slither has a --solc-remaps settings:

https://github.com/crytic/slither/issues/1121

And is also development environment aware (from docs):

Run Slither on a Truffle/Embark/Dapp/Etherlime/Hardhat application:

slither .

These remapping can be implemented at the solc level for all the tools: https://ethereum.stackexchange.com/questions/74448/what-are-remappings-and-how-do-they-work-in-solidity

njelich commented 1 year ago

Similarly for mythril:

https://github.com/ConsenSys/mythril/issues/107

jff commented 1 year ago

One way to support this is to enable a parameter that is sent to solc for all the execution tasks. For example,

./smartbugs -t all -f samples/simple_dao.sol --json --solc-options="--bin @openzeppelin/contracts-ethereum-package=/Your/Absolute/Path/To/@openzeppelin/contracts-ethereum-package"

Would this be convenient for your use case, @njelich ?

@gsalzer Can you think of any other use cases where this could be useful?

njelich commented 1 year ago

This is the general approach, I'd just want to make sure it works for all the analyzers correctly. Do they all support solc mappings?

SeherSaylik commented 1 year ago

Here are the analysis of tools regarding env/remapping support. Tools that use crytic-compile support most of the dev environments.

	version	DevEnv aware	Remappings	Flattened code
ConFuzzius	#4315fb7 v0.0.1			:heavy_check_mark:
Conkas	#6aee098			:heavy_check_mark:
Ethainter
eThor	2021 (CCS 2020)
HoneyBadger				:heavy_check_mark:
MadMax	#6e9a6e9
Maian	#4bab09a			:heavy_check_mark:
Manticore	0.3.7	:heavy_check_mark:	:heavy_check_mark:	:heavy_check_mark:
Mythril	0.23.5		:heavy_check_mark:	:heavy_check_mark:
Osiris				:heavy_check_mark:
Oyente	#480e725	:heavy_check_mark:		:heavy_check_mark:
Pakala	#c84ef38 v1.1.10
Securify				:heavy_check_mark:
Slither		:heavy_check_mark:	:heavy_check_mark:	:heavy_check_mark:
Smartcheck		:heavy_check_mark:		:heavy_check_mark:
Solhint	2.1.0			:heavy_check_mark:
teEther	#04adf56
Vandal	#d2b0043

gsalzer commented 1 year ago

We have to keep in mind that the processes running inside the container don't have access to the file system outside. Below, the first option seems to be easier to realize but doesn't quite capture the original intent (I think). The second option seems to be what @njelich has in mind, but it is not straightforward.

Option 1: The tool operates in bytecode/runtime mode. In this case, SmartBugs might compile the contract outside of the container, using whatever settings the user provides (though they should be the same for all contracts in a single run, like some include-path), and then copy only the bytecode/runtime code into the container. This is an extension I have already been thinking of, as this would allow the user to analyze Solidity code even for tools not supporting it explicitly, with the penalty that the findings are not presented in the context of the source code. (Well, the findings parser might be able to use the source mapping to calculate line numbers from bytecode addresses.)

Option 2: The tool operates in source code mode, needs to compile the source itself, and the source consists of more than a single file. The problem here is how to transfer the compilation environment into the Docker container. Neither the compiler nor the tool are able to access files outside of the container when run inside, so all imported files need to be inside, maybe requiring remapping, as the imported files may end up in a place different from the outside. A solution might be to support sol-json in addition to pure sol files. sol-json is an input format accepted by solc where all required files as well as all settings are provided in a single json file. Tools that do not look into the sol file themselves can be tricked into processing it by making solc a wrapper script that calls the true solc with appropriate options (and doing any other pre- and post-processing). One problem here is that the user has to provide the source as sol-json. There might be tools accomplishing this; in this case SmartBugs might take care of the packaging as well. Another problem are tools that process Solidity sources themselves, without using solc. If they do not know how to handle sol-json, it will be tricky to emulate the world outside within the container, while preserving import pathes.

njelich commented 1 year ago

I'm currently leaning more towards Option 1:

Support for the bytecode only tools (Ethainter, eThor, MadMax, Pakala, teEther, Vandal).
Easy to implement source mapping, without relying on any of the tools actually supporting it.
Since we are already parsing the solc output, we can now add a smartbugs level exclude pattern. (E.g. don't show results for OZ/Chainlink etc contracts, since they are out of scope)

With that in mind, would it make sense to add something like CryticCompiler as top level to smartbugs? It's actively maintained and staying on top of the compiler and dev env interaction management.

gsalzer commented 1 year ago

@njelich re CryticCompiler: I will check. Being actively maintained is certainly a plus, like is the support of dev envs. There are some special requirements of SmartBugs, like cross-compilation (e.g., SmartBugs on Mac needs Linux solc inside the container), that may require adaptions. Re exclude patterns: Some tools actually support specifying the name of the contract to analyze. It just isn't supported by SB yet, but makes much sense: why waste comp time on analyzing irrelevant contracts?

gsalzer commented 1 year ago

@SeherSaylik Thanks for the useful survey!

njelich commented 1 year ago

Thanks @SeherSaylik, yea CryticCompiler is an eye opening option.

@gsalzer For some things we cannot skip analyzing - a NFT contract that extends the OZ ERC721 implementation will of course include the OZ code in it.

I was thinking and talking also about the ease of filtering results. The source mapping is very powerful and standardized, and allows us to remove the many false positive results that show for any big library (like OZ)

njelich commented 1 year ago

@gsalzer @jff Can you point us towards some steps how we can help implement these exclusions? Maybe the place where the code to filter out the results would best belong?

gsalzer commented 1 year ago

@njelich Not yet sure that I understand completely what kind of filtering you are thinking of. Do you want to filter the output of those tools that process source code and include line numbers already now, or are you thinking of tools that, after the enhancement above, will handle source code thanks to SmartBugs' (future) pre/postprocessing? In the first case, the json/sarif output of Smartbugs contains file names and line numbers (if the tool provides it).

Ideally, any pre-/postprocessing that requires non-Python components should happen inside the container, to be portable across OSs. Here are the hooks that are currently available.

Preprocessing outside: For source files, SB extracts the solc version and loads a solc for Linux that is later copied into the container. Analysing the pragma and loading solc is done in sb/smartbugs.py:collect_tasks(), copying the compiler happens in sb/docker.py:__docker_volume(). There are no explicit hooks to extend the preprocessing outside.
Preprocessing inside: Can be done by modifying the do_* scripts in the tools/<toolid>/scripts folder. The scripts have access to the analyzed file, the solc compiler, and everything inside the container.
Postprocessing inside: Like the preprocessing, in the do_* scripts. These scripts do the preprocessing inside, call the tool, and then can do postprocessing.
Postprocessing outside: The output of the tool is fed to tools/<toolid>/parser.py. This python script has access to stdout/stderr of the container and to all files in the designated output folder of the container (see output option in tools/<toolid>/config.yaml).

What ever can be done by modifying the files in tools/<toolid> is rather part of the tool configuration and does not touch SB. Any additional preprocessing outside requires modifications of SB.

jff commented 1 year ago

@njelich It would be helpful if you could provide an end-to-end example illustrating the feature (from the CLI command to the expected result). At the moment, I do not fully understand the envisaged feature/filtering. Also, would this be only for tools that support bytecode?

yudikubota commented 1 year ago

I have encountered this problem too. What I'm doing to mitigate this is preprocessing the contracts outside Smartbugs. I wrote a python script that takes Etherscan output and resolves the dependency graph to flatten the source code. It works for 96% of contracts in the updated wild dataset.

This method is not ideal for the reasons mentioned above but is the best I could do right now to process complex contracts.

It would be nice to integrate this preprocessing script (and others) into Smartbugs.

faizzaidi commented 1 year ago

Hi everyone,

I am new in web3sec; I have faced the same issue while working with contracts using OpenZeppelin chainlink. I tried to use any open-source scripts for flattening the code, i.e., https://github.com/ConsenSys/surya, but it did not work for me. I am getting errors like the below one:

{"success": false, "error": "Solc experienced a fatal error.\n\nParserError: Expected pragma, import directive or contract/interface/library/struct/enum/constant/function/error definition.\n --> /sb/ICO_flat.sol:1:1:\n |\n1 | / The following code is from flattening this file: contracts/ICO.sol\n | ^\n\n", "issues": []}.

I would appreciate it if you could help with this or provide a doc for smartbugs usage. I have gone through the smartbugs wiki and https://fenix.tecnico.ulisboa.pt/downloadFile/1689244997259999/80990-antonio-monteiro_dissertacao.pdf.

gsalzer commented 1 year ago

@faizzaidi Please open a separate issue and add a (flattened) input file that results in such an error (ideally a rather small one). Otherwise it is difficult to guess what's going wrong. Basically, if you supply a single Solidity file that compiles it should work. This is independent from the request to extend SmartBugs that it does the flattening itself (or handles includes in some other way).

smartbugs / smartbugs

Running on contracts with external dependencies #143