Open serapath opened 5 years ago
change of mind:
// ------------------------
// RAW DATASET
sourcecode/xx240...95k.sol
sourcecode/zu240...892.sol
sourcecode/zu240...892.json = [0x123..., 0x3535..., ...]
sourcecode/zu240...892.json = [0x123..., 0x3535..., ...]
// transactions
transactions/0x...
// wallets
wallets/0x...
// contracts
contract/0x457....2218.json = {}
contract/0x457....2218.json
contract/0x457....2218.json
contract/0x6fs....2218.json
contract/0x48h....3456.json
contract/0x457....2218.json
// @TODO: optimization behind the scenes for later
contract/0x457....2218#compiler.version = 0.4.23
contract/0x457....2218#compiler.optimized = 1
// ------------------------
=> INDEX HYPERTRIE to link to all particular index hypertries (e.g. one for solidity > 0.5)
// INDEXES
// contracts
zu240...892 = [0x6f..., 0x48h..., ....]
// solidity > 0.5
zu240...892 = [xx240..., zu240..., ....]
// ------------------------
=> ANNOTATIONS HYPERTRIE to link to all particular annotation hypertries (e.g. created by OpenZeppelin OR audited by MythX)
// ANNOTATIONS
audited = [xx240...sol, x53f...sol, ...]
db.put('sourcecode/xx240...95k/source.sol', `contract Foo { ...sourcecode ... }`)
db.put('sourcecode/xx240...95k/0x457....2212.json', {
  compiler: 'v5.0.0',
  filepath: '...',
  imports: {
  }
})
db.get('sourcecode/xx240...95k/source.sol', (err, val) => console.log(val))
ite = db.iterator('sourcecode/', [options])
stream = db.createReadStream(prefix, [options])
db.list(prefix, [options], callback)
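To try the calls above without a hypertrie instance, a small in-memory stand-in can be used. This is only a sketch: `MemoryTrie` is a hypothetical class, not part of hypertrie, and it mimics just the callback style of `put`, `get` and `list`. The node shape `{ key, value }` is an assumption, and the keys are made-up placeholders.

```javascript
// Hypothetical in-memory stand-in for a prefix-keyed store (NOT hypertrie itself).
class MemoryTrie {
  constructor () {
    this.nodes = new Map() // key -> value
  }

  // Store a value under a slash-separated key.
  put (key, value, callback) {
    this.nodes.set(key, value)
    if (callback) callback(null)
  }

  // Look up one key; yields null when the key is absent.
  get (key, callback) {
    const found = this.nodes.has(key)
    callback(null, found ? { key, value: this.nodes.get(key) } : null)
  }

  // List every node whose key starts with the given prefix, sorted by key.
  list (prefix, callback) {
    const result = []
    for (const [key, value] of this.nodes) {
      if (key.startsWith(prefix)) result.push({ key, value })
    }
    result.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0))
    callback(null, result)
  }
}

const db = new MemoryTrie()
// Placeholder hashes and addresses, for illustration only.
db.put('sourcecode/aaa/source.sol', 'contract Foo {}')
db.put('sourcecode/aaa/0x01/contract.json', { compiler: 'v5.0.0' })
db.put('wallets/0x02', {})

// Only the two keys under "sourcecode/" are listed; the wallet is skipped.
db.list('sourcecode/', (err, nodes) => {
  if (err) throw err
  for (const node of nodes) console.log(node.key)
})
```

The point of the sketch is that a shared key prefix is enough to fetch a source file together with all of its deployed instances in one call.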
@todo
datdot-tcir#2: AST import statement analysis and content hash replacement, to re-write all contract data into the correct format
problem1 Many contract files contain more than one contract definition. They need to be split into multiple files, each with only a single contract definition. The files are then linked back together by adding the correct import statements, so that the result works the same way as the single file with multiple contract definitions.
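The splitting step from problem1 could look roughly like the sketch below. It is deliberately naive: it uses a regular expression and brace counting instead of the AST analysis the todo calls for, assumes no braces or `contract` keywords inside strings or comments, handles only plain `contract` definitions, and does not carry over `pragma` lines. The function name `splitContracts` and the `./<Name>.sol` file naming are hypothetical.

```javascript
// Naive sketch: split a Solidity source with several contract definitions
// into one file per contract. A real implementation would walk the AST.
function splitContracts (source) {
  const header = /\bcontract\s+([A-Za-z_]\w*)[^{]*\{/g
  const parts = []
  let match
  while ((match = header.exec(source)) !== null) {
    // Walk forward from the opening brace to its matching closing brace.
    let depth = 1
    let i = header.lastIndex
    while (i < source.length && depth > 0) {
      if (source[i] === '{') depth++
      else if (source[i] === '}') depth--
      i++
    }
    parts.push({ name: match[1], code: source.slice(match.index, i) })
    header.lastIndex = i // continue searching after this contract
  }

  const files = {}
  for (const part of parts) {
    // Import every other contract whose name is mentioned in this one.
    const imports = parts
      .filter(other => other !== part &&
        new RegExp('\\b' + other.name + '\\b').test(part.code))
      .map(other => `import "./${other.name}.sol";`)
    files[part.name + '.sol'] = imports.concat(part.code).join('\n')
  }
  return files
}

const combined = 'contract Bar { uint x; }\n' +
  'contract Foo is Bar { function f() public { x = 1; } }'
const files = splitContracts(combined)
console.log(Object.keys(files)) // one entry per contract definition
console.log(files['Foo.sol'])   // starts with the re-added import of Bar
```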
problem2 Two people can compile and deploy the same contract from different file paths:
/user/serapath/solidity/foo.sol importing /user/serapath/solidity/bar.sol
/user/chriseth/solidity/foo.sol importing /user/chriseth/solidity/bar.sol
The compiler will output two different bytecode hashes, so it is hard to verify a contract if you only have the source code but don't know the file path it was compiled from.
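One way to get an identifier that does not depend on the file path is to replace every import path with the content hash of the imported file before hashing the source itself. The sketch below illustrates that idea only. SHA-256, the `import "<path>";` syntax it recognises, and the names `contentHash` and `normalise` are all assumptions. It expects every imported path to be present in `files` and does not handle import cycles.

```javascript
const crypto = require('crypto')

// Hex SHA-256 of a string (the choice of hash function is an assumption).
function contentHash (text) {
  return crypto.createHash('sha256').update(text).digest('hex')
}

// Rewrite every `import "<path>";` so that the path becomes the content hash
// of the (recursively normalised) imported file. `files` maps path -> source.
function normalise (path, files) {
  return files[path].replace(/import\s+"([^"]+)"\s*;/g, (_, importPath) => {
    const imported = normalise(importPath, files)
    return `import "${contentHash(imported)}";`
  })
}

// The same two contracts, stored under two different user directories.
const serapath = {
  '/user/serapath/solidity/foo.sol':
    'import "/user/serapath/solidity/bar.sol";\ncontract Foo is Bar {}',
  '/user/serapath/solidity/bar.sol': 'contract Bar {}'
}
const chriseth = {
  '/user/chriseth/solidity/foo.sol':
    'import "/user/chriseth/solidity/bar.sol";\ncontract Foo is Bar {}',
  '/user/chriseth/solidity/bar.sol': 'contract Bar {}'
}

const a = contentHash(normalise('/user/serapath/solidity/foo.sol', serapath))
const b = contentHash(normalise('/user/chriseth/solidity/foo.sol', chriseth))
console.log(a === b) // true: the normalised sources are identical
```

This only makes the storage key path-independent. Matching deployed bytecode against it is a separate step.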
correct format (example)
sourcecode/xx240gjwspgj4j95K/ (= hash of source.sol where all imports are hash names)
  source.sol
  0x457dsaf704e39fc728058557a113b226207d2212/ (= deployed address of Foo instance)
    contract.json
  0xnv7d56704e39fc728058557a113b226207d2212/ (= another deployed address of Foo instance)
    contract.json
sourcecode/42j490gj294jg2094jg9w0jg4/
  source.sol
sourcecode/v2240gjwspgj4j94w/
  source.sol
sourcecode/79g79g79f8f6j6f/
  source.sol
  0xd556caf704e39fc728058557a113b226207d2212/
    contract.json
  0xa64sscaf704e39fc728058557a113b226207d2212/
    contract.json
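The layout in the example can be expressed as a pair of key-building helpers plus a function that groups deployed addresses by source hash. This is a sketch under the assumption that keys are plain slash-separated strings; `sourceKey`, `contractKey` and `deploymentsBySource` are hypothetical names, and the hashes and addresses are the placeholder values from the example.

```javascript
// Key of the source file stored under its content hash.
function sourceKey (sourceHash) {
  return `sourcecode/${sourceHash}/source.sol`
}

// Key of the metadata for one deployed instance of that source.
function contractKey (sourceHash, address) {
  return `sourcecode/${sourceHash}/${address}/contract.json`
}

// Group a flat list of keys into { sourceHash: [deployed addresses] }.
function deploymentsBySource (keys) {
  const result = {}
  for (const key of keys) {
    const [root, sourceHash, third, fourth] = key.split('/')
    if (root !== 'sourcecode') continue
    if (!result[sourceHash]) result[sourceHash] = []
    // Only keys of the form sourcecode/<hash>/<address>/contract.json
    // describe a deployment; source.sol keys just register the hash.
    if (fourth === 'contract.json') result[sourceHash].push(third)
  }
  return result
}

const keys = [
  sourceKey('xx240gjwspgj4j95K'),
  contractKey('xx240gjwspgj4j95K', '0x457dsaf704e39fc728058557a113b226207d2212'),
  contractKey('xx240gjwspgj4j95K', '0xnv7d56704e39fc728058557a113b226207d2212'),
  sourceKey('42j490gj294jg2094jg9w0jg4')
]
const grouped = deploymentsBySource(keys)
console.log(grouped)
```

A source that was never deployed, such as the second hash here, ends up with an empty address list.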