slothy-optimizer / slothy

Assembly super-optimization via constraint solving
https://slothy-optimizer.github.io/slothy/

More general structure for loop parsing #91

Open dop-amin opened 1 month ago

dop-amin commented 1 month ago

This PR adds a more generic structure for dealing with loops: each subclass of Loop implements a specific type of loop, for example one that ends with a certain sequence of instructions. Extraction works the same way for every loop type and is therefore implemented in Loop, while the methods that produce the code differ per subclass.
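
A minimal sketch of the structure described above, for illustration only. It is not the code in this PR and not SLOTHY's API: the method names `end_regexes`, `emit`, and `extract`, the subclasses `SubsLoop` and `CmpLoop`, and the sample assembly are all hypothetical. The base class owns the shared extraction; each subclass only describes its closing instruction sequence and how to emit it again.

```python
import re
from abc import ABC, abstractmethod


class Loop(ABC):
    """Hypothetical base class: extraction is shared, code emission is not."""

    @abstractmethod
    def end_regexes(self, label):
        """Return one regex per instruction that closes this loop type."""

    @abstractmethod
    def emit(self, body, label, info):
        """Return the loop as a list of lines around a (possibly new) body."""

    def extract(self, source, label):
        """Split source into (preamble, body, postamble, info) for `label`."""
        lines = [line.strip() for line in source.splitlines() if line.strip()]
        start = lines.index(f"{label}:")  # raises ValueError if label is absent
        pats = [re.compile(p) for p in self.end_regexes(label)]
        n = len(pats)
        for i in range(start + 1, len(lines) - n + 1):
            matches = [p.fullmatch(s) for p, s in zip(pats, lines[i:i + n])]
            if all(matches):
                info = {}
                for m in matches:
                    info.update(m.groupdict())  # e.g. the counter register
                return lines[:start], lines[start + 1:i], lines[i + n:], info
        raise ValueError(f"no {type(self).__name__} ending found for '{label}'")


class SubsLoop(Loop):
    """Assumed loop form: ends in `subs <cnt>, <cnt>, #1` / `bne <label>`."""

    def end_regexes(self, label):
        return [r"subs\s+(?P<cnt>\w+),\s*(?P=cnt),\s*#1",
                rf"bne\s+{re.escape(label)}"]

    def emit(self, body, label, info):
        cnt = info["cnt"]
        return [f"{label}:", *body, f"subs {cnt}, {cnt}, #1", f"bne {label}"]


class CmpLoop(Loop):
    """Assumed loop form: ends in `cmp <ptr>, <end>` / `bne <label>`."""

    def end_regexes(self, label):
        return [r"cmp\s+(?P<ptr>\w+),\s*(?P<end>\w+)",
                rf"bne\s+{re.escape(label)}"]

    def emit(self, body, label, info):
        return [f"{label}:", *body,
                f"cmp {info['ptr']}, {info['end']}", f"bne {label}"]


# Made-up input, used only to exercise the sketch.
SOURCE = """
mov r0, #4
loop:
ldr r1, [r2], #4
add r3, r3, r1
subs r0, r0, #1
bne loop
bx lr
"""

pre, body, post, info = SubsLoop().extract(SOURCE, "loop")
rebuilt = SubsLoop().emit(body, "loop", info)
```

In this sketch, supporting a new loop form means adding one subclass; `extract` stays untouched. The `info` dictionary stands in for the extra per-loop data (such as a counter or pointer register) that a subclass may need when emitting code.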

I am open to suggestions on how to improve this approach; for my use cases with armv7m it has worked fine as is.

dop-amin commented 6 days ago

@hanno-becker I think the PR is now a good starting point for abstracting loop handling. However, new cases may come up in the future that require slight tweaks, e.g., passing more or different inputs to the loop subclasses. I already pass some data that I know will be useful from our experiments with Armv7m, especially for more complicated loop constructions that check against a pointer that is modified inside the kernel.

Are there any tests you'd like me to run besides CI? I have already fully optimized one aarch64 example, and the output code still passes the test.

dop-amin commented 7 hours ago

Thank you @dop-amin, overall this looks good -- better flexibility for loop forms has been an embarrassing gap for a while.

While you're at it, could you improve the documentation a bit, and hoist the abstract Loop class to a place where it can be shared between architecture models?

Also, would you mind adding minimal examples to examples.py which demonstrate each loop form, so we run them in CI?

I think this should all be done by now.