
Lack of proper heuristics in the Function Specializer can lead to large contract sizes #14893

Open kaber2 opened 7 months ago

kaber2 commented 7 months ago

Description

I have a smart contract where the majority of the functionality is reached through one private function, and there are two external entry points. Solidity currently inlines the private function (including everything it calls) into both of the external entry points. I now wanted to add a third external entry point, and Solidity inlines the entire private function yet again, making the contract blow up by almost 50% and exceed the blockchain's contract size limit.
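For illustration, here is a hypothetical sketch of the shape of such a contract (the names and logic are made up, not from my actual code): one large private function as the common entry point, with each external function calling it with constant arguments.

```solidity
// SPDX-License-Identifier: GPL-3.0
pragma solidity ^0.8.0;

contract Example {
    // Hypothetical stand-in for the large private function that most
    // of the functionality goes through.
    function _run(uint256 mode, bytes calldata data) private pure returns (uint256) {
        // ... imagine hundreds of lines of logic and further internal calls ...
        return mode + data.length;
    }

    // Each external entry point calls _run with a literal `mode`, which
    // makes the call a candidate for specialization and inlining.
    function entryA(bytes calldata data) external pure returns (uint256) {
        return _run(1, data);
    }

    function entryB(bytes calldata data) external pure returns (uint256) {
        return _run(2, data);
    }
}
```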

This happens with via-ir at any optimization run count, even when going down to zero. Turning off optimization is not an option for various reasons, but I also don't want to turn off inlining completely, as I have many functions that are designed to be inlined. I would expect the compiler to pay some consideration to the size of the function being inlined and not blow the contract up to almost 3x the necessary size just to avoid a jump.

Is there anything I can do to make the compiler stop inlining this very large function, while still inlining small functions?

Environment

cameel commented 7 months ago

That's not supposed to happen. Looking at the current inlining heuristics, the only situation in which we'd inline a huge function is when it's used only once.

The only explanation that comes to mind is the FunctionSpecializer step (F), which may create a copy of the function when you call it with a literal argument. This can lead to multiple copies of the function, which are then each inlined by FullInliner (i). The inliner runs before the specializer, but that part of the sequence is repeated multiple times, so the specialized functions can still get inlined on the next pass.
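Conceptually (the actual transform operates on the Yul IR; the names and code below are illustrative only), specializing a call that passes a literal amounts to this source-level rewrite:

```solidity
// SPDX-License-Identifier: GPL-3.0
pragma solidity ^0.8.0;

// Before: one shared function, with a call site passing a literal.
function f(uint256 a, uint256 b) pure returns (uint256) {
    return a * b + 1;
}
// call site: f(x, 42)

// After FunctionSpecializer (conceptually): a copy of the function
// with the literal baked in, and the call site rewritten to use it.
function f_42(uint256 a) pure returns (uint256) {
    return a * 42 + 1;
}
// call site: f_42(x)
```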

You could try to verify this by using a custom sequence with F removed. If that's the cause, a temporary workaround would be to call these functions with something that's not a literal, to bypass the step. Putting the literal in a variable might do the trick, but you may need something more elaborate, since the optimizer can still, to a large extent, deduce that something evaluates to a literal.
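A minimal sketch of that workaround, with a hypothetical function f (and with the caveat above that the optimizer may see through it):

```solidity
// SPDX-License-Identifier: GPL-3.0
pragma solidity ^0.8.0;

contract Workaround {
    function f(uint256 mode, uint256 x) private pure returns (uint256) {
        return mode + x; // stand-in for the large function
    }

    function entry(uint256 x) external pure returns (uint256) {
        // Route the literal through a local variable so the call site no
        // longer passes a literal directly. The optimizer may still fold
        // this back into a literal, so it's not a guaranteed fix.
        uint256 mode = 1;
        return f(mode, x);
    }
}
```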

As for the long-term solution, maybe we should reconsider whether the specializer is worth it if it can run into such corner cases, or put some limits on it. I'm actually working on improving the default sequence right now and I can check what the effect of removing the specializer is in practice, but it would be best if you could confirm this first.

Do you have a repro for this? If it's of a reasonable size, I could even include it as another test contract in my benchmark.

kaber2 commented 7 months ago

Thank you for the hint, I will give this a try and report back.

kaber2 commented 7 months ago

PS: regarding the repro, unfortunately no, and I can't share this specific contract. I'll see if your suggestion makes a difference and will then see if I can quickly whip something up.

kaber2 commented 7 months ago

Ok, I used the default options from OptimiserSettings.h with F removed, and indeed that massively reduced my contract size. I also tried adding an intermediate function to remove the constant parameters of the large function, but as you suspected, the optimizer saw through that and it had no effect.
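For anyone else trying this: one way to do it without modifying the compiler source is solc's --yul-optimizations option, which overrides the step sequence. A sketch (the F-less sequence here is the one quoted later in this thread; check OptimiserSettings.h for the exact default in your compiler version):

```bash
# Sketch: compile via IR with a custom Yul optimizer step sequence,
# i.e. the default sequence with the FunctionSpecializer step 'F' removed.
solc --via-ir --optimize \
  --yul-optimizations 'dhfoDgvulfnTUtnIf[xa[r]EscLMcCTUtTOntnfDIulLculVcul [j]Tpeulxa[rul]xa[r]cLgvifCTUca[r]LSsTOtfDnca[r]Iulc]jmul[jul] VcTOcul jmul:fDnTOcmu' \
  --bin Contract.sol
```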

I'll play around a bit and see if I can find something else worthwhile reporting.

Thanks again for your help!

kaber2 commented 7 months ago

test.txt

I think I managed to create a reproducer. It has one big nonsense function and two external entry points that call the function with constant arguments. The function gets specialized for both callers unless I remove "F" from the optimizer options.

While probably not minimal, I hope that helps as a starting point.

leonardoalt commented 6 months ago

I think I have the same issue. Sol code: https://gist.github.com/leonardoalt/382603dec4da0425d073ffeed9356e7c

The code above, compiled with 0.8.25 and --via-ir, leads to ~13k code size. With --optimize, that number goes to ~31k (above the contract size limit).

I talked to @nikola-matic and we tried a few different things. I tried the optimizer sequence "dhfoDgvulfnTUtnIf[xa[r]EscLMcCTUtTOntnfDIulLculVcul [j]Tpeulxa[rul]xa[r]cLgvifCTUca[r]LSsTOtfDnca[r]Iulc]jmul[jul] VcTOcul jmul:fDnTOcmu" (without F) and it didn't help.

Here's a smaller contract where a similar thing happens (~4k to ~6k code size without/with optimizer): https://gist.github.com/leonardoalt/74fa6d3a05c9282638514c7e29760df9, though I think the first one I posted is more relevant.

cameel commented 6 months ago

I experimented with the specializer in the new sequence a bit and unfortunately this is tricky. There's a complex interaction between the specializer and the inliner, and how good the result is often depends on which one is first to see the code transformed enough for it to act.

For example, in one case there's a 10% increase in runtime gas cost when the specializer does not eliminate a literal argument from a function called in a tight loop. The tricky bit is that the best version is the one where the calls to the function outside the loop are inlined and the ones inside the loop are only specialized. The current sequence somehow strikes that balance; my new one does not.
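A hedged illustration of that pattern (made-up code, not the actual benchmark): a helper called with a literal both inside and outside a tight loop, where the best outcome inlines the call outside the loop but only specializes the ones inside it.

```solidity
// SPDX-License-Identifier: GPL-3.0
pragma solidity ^0.8.0;

contract LoopExample {
    function step(uint256 acc, uint256 k) private pure returns (uint256) {
        return acc * k + 1; // stand-in for a non-trivial helper
    }

    function run(uint256 n) external pure returns (uint256 acc) {
        // Outside the loop: inlining this single call is cheap.
        acc = step(0, 3);
        for (uint256 i = 0; i < n; ++i) {
            // Inside the loop: suppose the helper is too large to inline
            // here. Specializing away the literal 7 then saves the cost
            // of passing it on every iteration.
            acc = step(acc, 7);
        }
    }
}
```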

I managed to address that by adjusting the new sequence, which lowered gas use in this example, but then bytecode size in another example increased by 8%, this time because the specializer duplicated a function containing large constants. The interesting thing is that originally the new sequence was beating the current one by a large margin on that example, because it was not duplicating the function at all (the current sequence duplicates it once). With the adjustment it started duplicating it twice.

I also tried removing F from the current sequence, and it does not usually improve results. Specialization seems important; it just has to be applied at the right time.

Overall my impression is that this will be hard to address just by rearranging the sequence. We gain in some examples and lose in others. We do need some heuristics inside the specializer to decide when it's worth it.

kaber2 commented 6 months ago

I'm not sure how practical this would be, but here's an idea. I personally don't care too much about a bit of bloat in the contract and prefer lower execution costs, as long as it stays within deployable limits. So, as a stopgap until a proper heuristic is in place: compile with full specialization; if the size exceeds the limit, disable specialization for the biggest resulting function, then for the second biggest, and so on, until the contract reaches a valid size or no candidates are left.