ethereumjs / ethereumjs-monorepo

Monorepo for the Ethereum VM TypeScript Implementation
2.57k stars 744 forks source link

EVM/Common -> Tree Shaking Refactor: Separate Common EVM Gas Prices from other Params #3488

Closed holgerd77 closed 1 month ago

holgerd77 commented 1 month ago

Part of #3446

Earliest start together with official breaking release work!

Common remains to be a thing with its 179 KB in size, which is a lot for a configuration library, also referencing the other Common issue #3448 here along. So it will be hard to substantially bring the 300 KB or so from Tx - other numbers but somewhat similar with Block - if we do not substantially bring Common down.

grafik

The vast chunk of the parameters actually make up the gas prices for the EVM (>70% on first estimate). I first thought that we should fully move to EVM, turns out to be too difficult on second look though with all the which-gas-price-at-which-hardfork logic baked into Common which we would otherwise need to replicate.

What we can do instead though is to separate VM/EVM and non-VM/EVM params in Common itself a bit more by doing two separate eips.ts and eipsEVM.ts (or so) as well as hardforks.ts and hardforksEVM.ts files (or, alternative, maybe also enough: at least do separate data structures EIPsDict and EIPsDictEVM or so).

Then we can do separate constructors like createCommon() (#3487 already baked in here! 🙂) and createCommonWithEVM().

This would already allow to tree-shake things out for Block/Tx/Blockchain users and so cut out 20-30KB from the 40KB eips.ts and hardforks.ts together.

There might be better solutions out there, open for suggestions (one would e.g. need to be a bit more careful that a non-EVM common does not make it into the EVM, I would suggest a boolean flag in Common isEVMCommon or so and a direct check everytime a Common is set in EVM/VM with directly throwing so that this is in doubt detected early).

But the proposed solution would already substantially improve the situation.

holgerd77 commented 1 month ago

After seeing how much this is condensed in #3500 from @scorbajio (see e.g. https://github.com/ethereumjs/ethereumjs-monorepo/blob/7bf3771210e9c353e1d44c65c289c6428fffb0d6/packages/common/src/hardforks.ts) I wonder if this change I proposed here is really worth the effort (respectively I guess I am rather on the not worth it side already), since this would introduce significant double structures - also with some additional bloat - and the "new bloat to win" ratio would still be positive but also not sooo super big.

What might be worth to still do instead is to move all three comment, url and status fields to a unifying code comment above hardforks and EIPs, a bit like in the work from the PR from Amir referenced.

This reduction would actually again be somewhat significant (for the low-level amount of work), and - I guess ? - there should be no side effects around (no one using this - again - programatically, right?).

Maybe someone confirms though before we do.

jochem-brouwer commented 1 month ago

I think, from a tree shaking perspective, that it still makes sense to move out all the parameters which are used by EVM into a new file. For instance, if you use the tx library, then you will need Common, but you don't want to import all the gas parameters into there.

I agree that the comment/url/status should indeed be moved into a comment. I don't think anyone uses it programatically.

holgerd77 commented 1 month ago

I agree that the comment/url/status should indeed be moved into a comment. I don't think anyone uses it programatically.

Ok, I would like that we do that first before we proceed here, then we have a better overview on the size relations. Still not sure if we should really do.

Then we would need to (at least partly) double structures like:

1153: {
    comment: 'Transient storage opcodes',
    url: 'https://eips.ethereum.org/EIPS/eip-1153',
    status: Status.Review,
    minimumHardfork: Hardfork.Chainstart,
    requiredEIPs: [],
    gasPrices: {
      tstore: {
        v: 100,
        d: 'Base fee of the TSTORE opcode',
      },
      tload: {
        v: 100,
        d: 'Base fee of the TLOAD opcode',
      },
    },
  }

This would actually also introduce many new lines of code, also additional complexity. Should really look closely here.

jochem-brouwer commented 1 month ago

Ah wait, I think I got the idea wrong. Why does the code have access to comment, url, status? The comment and url, can't these be commented out? (And possibly status too?). It would be the same argument as why we have moved the d values of the params to a comment (instead of making it accessible to code).

EDIT: nvm, I got the idea correct. I just don't see what "double structures" we would get here?

holgerd77 commented 1 month ago

We without doubt can do the comment, url, status thing.

The double structures we would get for the EVM separation thing.

So we would need to add separate dictionaries for all (?) EIPs e.g., one for the basic stuff, one for EVM values (like the 1153 one I posted above).

Then we would need to decide where to put minimumHardfork and requiredEIPs (the base one likely). Then we would need to add/adjust types for the EVM EIP dicts. Then we would need new logic to crawl both through the base and the EVM EIP files.

That's all overhead. Maybe not dramatic. But not sure if it's worth.

So I would suggest to first do the simple thing (comment, url, status) and see where we are at. 🙂

acolytec3 commented 1 month ago

Is this one complete?

holgerd77 commented 1 month ago

Yes.