mlcommons / ck

Collective Mind (CM) is a small, modular, cross-platform and decentralized workflow automation framework with a human-friendly interface and reusable automation recipes to make it easier to build, run, benchmark and optimize AI, ML and other applications and systems across diverse and continuously changing models, data, software and hardware
https://doi.org/10.5281/zenodo.8105339
Apache License 2.0

Question about the usage of compiler autotuning #133

Closed joy369 closed 2 years ago

joy369 commented 3 years ago

Hi all,

I am pretty new here, and I found this cool place through this paper. To my understanding, one feature of CK is a framework for automatically tuning a compiler's optimization flags for a program. Furthermore, this can be done across platforms and devices. It's super convenient!

However, I really have difficulty reading the following documentation, even with the help of the paper above: https://ck.readthedocs.io/en/latest/index.html https://github.com/ctuning/ck/wiki/Compiler-autotuning https://github.com/ctuning/ck/wiki/autotuning

Perhaps I need to spend more time digging into those materials, but the problems below are hard for me.

  1. The only way I know to learn the usage of the CK front-end shell commands is to run them with --help in my terminal. I could not find the API documents online.

  2. Is there a tutorial that guides users through adding their own library and toolchain and then running automatically on the device via adb? Let's say I want to use another toolchain to reproduce the zlib optimization from the paper step by step without using the "ck replay" command. I will:
     a. Download zlib
     b. Download a toolchain
     After that, I am really confused about how to set up CK so that it does the optimization for me.

I know I am stuck at the very beginning, but may I have some hints from you? Any reply is appreciated. Thanks a lot!

gfursin commented 3 years ago

Hi @joy369,

Thank you very much for your interest! I do understand your frustration and share the same feelings!

The main issue is that I was mostly focusing on finalizing the prototyping stage of the CK technology and had very limited resources to improve docs and provide user-friendly tutorials. That's really a shame since I had similar feedback from other users over time. However, since I finally finished prototyping the CK concept, I finally started addressing some of the above issues.

For example, it's possible to search for CK modules here and see their APIs with related Python code and dependencies. However, there is no tutorial on adding new toolchains and compilers, and I agree that it's very tricky to figure it out yourself :( ...

I am preparing the CK2 project for 2021 with a focus on autotuning and one of the priorities is to start adding tutorials to make the onboarding process for new users easier. Would you like us to keep you informed and exchange ideas? Just send me an email or register at the CK mailing list and/or here.

In the meantime, you may be interested to have a look at related compiler autotuning projects from my colleague @ChrisCummins.

Thanks again for your interest and sorry about the lack of tutorials :( ...

CC @phesse001

gfursin commented 3 years ago

By the way, a new Android toolchain can generally be added by copying and modifying the most closely related CK OS android* entry from here: https://github.com/ctuning/ck-env/tree/master/os .

Usually you do it via:

ck pull repo:ck-env
ck cp ck-env:os:android29-arm64 ck-env:os:android30-arm64
vim `ck find ck-env:os:android30-arm64`/.cm/meta.json

You can then use the new description when cross-compiling your program:

ck compile program:{some program} --target_os=android30-arm64
ck run program:{some program} --target_os=android30-arm64
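When the same compile and run steps have to be repeated for several programs or target OS entries, it can help to generate the command lines programmatically. The following is only a minimal sketch: the helper name `ck_cross_compile_commands` and the program name `my-program` are hypothetical, and the only CK syntax it relies on is the `ck compile` / `ck run` form with `--target_os` shown above.

```python
import shlex


def ck_cross_compile_commands(program, target_os):
    """Build the 'ck compile' and 'ck run' command lines for one program.

    Hypothetical helper: it only assembles argument lists and does not
    invoke CK, so it can be inspected before anything is executed.
    """
    target = f"--target_os={target_os}"
    return [
        ["ck", "compile", f"program:{program}", target],
        ["ck", "run", f"program:{program}", target],
    ]


if __name__ == "__main__":
    # 'my-program' is a placeholder for a real CK program entry.
    for cmd in ck_cross_compile_commands("my-program", "android30-arm64"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once CK and the new OS entry are set up.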

The detection of Android toolchains is done in these soft detection plugins. They can be modified to set up environment variables for new toolchains if needed ...

Adding a new program (library) is briefly described here.

I tried to separate program description from toolchains. The idea is that when you add a new program (based on some existing one similar to Wikipedia), the CK should automatically pick up the toolchain from the existing ones, set up environment variables (using soft detection plugins) and generate tmp scripts to build and run your program automatically ...

Hope it's of some help. I hope that we will find resources to add tutorials describing this process so that one does not need to navigate through all the APIs, meta descriptions and scattered documentation.

Cheers!

ChrisCummins commented 3 years ago

Hi @joy369, since Grigori recommended my work I thought it would be worth a mention that I believe I have a project that will help with precisely what you're after, specifically: tutorials aimed at how to apply autotuning techniques to building your own software, and how to add support for new compilers/toolchains. It's due to launch in a couple of weeks so I'll post a link here when it's live!

Cheers, Chris

gfursin commented 3 years ago

That's great! Thanks @ChrisCummins - looking forward to it. I am also interested to see if we could connect your tools with the CK workflows to reuse CK benchmarks and data sets - that would be cool!

joy369 commented 3 years ago

@gfursin Big thanks to you for pointing me in the right direction to investigate! :D I cannot digest all the information at one glance, so I will try to read and understand it.

And yes, I would like to hear the news about this toolkit. I will register later. Honestly, I only recently started looking into the research area of compiler autotuning (or, let's say, option selection). I have read some papers, yet many things are still pretty new and interesting to me. Just to list a few:

  1. Effective selection methods for choosing the options. (I still find that random selection is one of the baselines.)
  2. Thermal problems on embedded devices when running autotuning across platforms.
  3. With CK autotuning, does the profiling method rely on timestamps or on sampling? ...

It would be really great for me to have a chance to talk with a master like you. :D

@ChrisCummins Looking forward to your tutorial. I will check your GitHub and give it a try!

Just one question: I know that the same settings can significantly increase or decrease performance even across two different *.cpp files, but I am a bit curious. Would the tutorial cover the flow for a project that relies on CMake to build?

Thank you for all of your comments!

gfursin commented 3 years ago

Hi @joy369,

Sorry for the delay in replying - I am preparing a new project and swamped at the moment.

The questions you listed are very reasonable and interesting.

1) One of the main reasons why I postponed my academic research on ML-based autotuning and started working on the CK framework was to make it possible to test different design space exploration methods across diverse programs, data sets, compilers and systems in a collaborative and reproducible way while avoiding many pitfalls. I mentioned some in these two talks this summer: Google and FastPath and these papers: 1 and 2.

Another reason was to collect enough diverse programs and datasets from the community to make the use of ML for autotuning statistically meaningful (particularly important for deep learning). Random search makes sense at the beginning of your exploration when you do not have enough knowledge about your optimization space, but if you have enough samples, a model-based approach should eventually beat random search.

The last reason was to distribute design space exploration across as many volunteers with diverse platforms and environments as possible:

As you can see, we collected lots of optimization results from the community 2 years ago during our project with the Raspberry Pi foundation. Unfortunately, I didn't do more analysis and experiments since then because I switched my focus to reproducible ML system co-design. I may get back to this topic in 2021 and will be happy to keep in touch...

Also have a look at this paper (unless you saw it already): "A Survey on Compiler Autotuning using Machine Learning": https://arxiv.org/abs/1801.04405 .
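The random-search baseline mentioned under point 1 can be illustrated with a small, self-contained sketch. This is not CK's autotuner: the flag pool, the `synthetic_cost` function and its weights are all made up for illustration, standing in for a real "compile, run, measure" step.

```python
import random

# Hypothetical pool of on/off flags; a real study would use the
# flags of the compiler under test.
FLAGS = [
    "-funroll-loops",
    "-ftree-vectorize",
    "-fomit-frame-pointer",
    "-finline-functions",
]

# Made-up per-flag effects on run time (negative means faster).
_WEIGHTS = {
    "-funroll-loops": -0.3,
    "-ftree-vectorize": -0.5,
    "-fomit-frame-pointer": -0.1,
    "-finline-functions": 0.2,
}


def synthetic_cost(flags):
    """Stand-in for compiling with 'flags' and timing the program."""
    return 10.0 + sum(_WEIGHTS[f] for f in flags)


def random_search(cost, flags, iterations, seed=0):
    """Sample random flag subsets and keep the cheapest one seen."""
    rng = random.Random(seed)
    best_flags, best_cost = (), cost(())  # baseline: no extra flags
    for _ in range(iterations):
        # Include each flag independently with probability 0.5.
        candidate = tuple(f for f in flags if rng.random() < 0.5)
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best_flags, best_cost = candidate, candidate_cost
    return best_flags, best_cost


if __name__ == "__main__":
    chosen, value = random_search(synthetic_cost, FLAGS, iterations=50)
    print(chosen, value)
```

A model-based method would replace the uniform sampling step with a predictor trained on earlier samples, which is where a large and diverse set of collected results becomes useful.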

2) I am not sure what you mean by the thermal problem of embedded devices during autotuning ;)

3) In CK we use system timing by default, and we can plug in oprofile and other sampling profilers if needed (for example to collect hardware counters).
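As a rough illustration of timing-based measurement (as opposed to sampling profilers), the sketch below times a command with a wall-clock timer over several repetitions. It is not CK's implementation; the function name `time_command` and the choice to report the minimum and median are assumptions made for this example.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repetitions=5):
    """Run 'cmd' several times and collect wall-clock durations in seconds."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "samples": samples,
    }


if __name__ == "__main__":
    # Time a trivial Python child process as a placeholder workload.
    result = time_command([sys.executable, "-c", "sum(range(100000))"])
    print(f"min={result['min']:.4f}s median={result['median']:.4f}s")
```

Repeating the run and looking at the minimum or median is a common way to reduce the effect of noise from other processes. On a device that slows down as it heats up, the spread of the samples may also hint at that.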

If you want, we can arrange a conf-call at some point - just send me an email at Grigori.Fursin@cTuning.org ...

And good luck with the compiler autotuning R&D ;) ...

ChrisCummins commented 3 years ago

Hi @joy369, I'm following up here to announce CompilerGym, a research platform for compiler autotuning. In particular, it makes it very easy to experiment with different optimization strategies through a simple python interface.

If you compile your program to LLVM-IR:

$ clang++ -emit-llvm -c myapp.cc

you could then run a simple random tuning strategy using:

$ pip3 install compiler_gym
$ python -m compiler_gym.bin.random_search --env=llvm-ic-v0 --benchmark=file:///$(pwd)/myapp.bc

For cmake integration you would just need to figure out how to adapt your build to append the -emit-llvm flag to generate the LLVM-IR bitcode files. Shortly after the new year we will be adding three tutorials on (1) Makefile integration, (2) adding support for a new compiler, and (3) using reinforcement learning to build optimization policies. These will be added to the documentation website. Stay tuned!
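As a sketch of how the two steps above could be scripted for one source file, the helpers below assemble the `clang++` and `random_search` command lines. The helper names are hypothetical, and the explicit `-o` output path is an addition for clarity. Everything else reuses only the flags shown in the commands above.

```python
from pathlib import Path


def bitcode_command(source, out_dir="."):
    """Return the clang++ command that emits LLVM bitcode, plus its output path."""
    bitcode = Path(out_dir) / (Path(source).stem + ".bc")
    cmd = ["clang++", "-emit-llvm", "-c", str(source), "-o", str(bitcode)]
    return cmd, bitcode


def random_search_command(bitcode):
    """Return the random-search command line for one bitcode file."""
    # as_uri() needs an absolute path, hence resolve().
    uri = Path(bitcode).resolve().as_uri()
    return [
        "python",
        "-m",
        "compiler_gym.bin.random_search",
        "--env=llvm-ic-v0",
        f"--benchmark={uri}",
    ]


if __name__ == "__main__":
    compile_cmd, bc = bitcode_command("myapp.cc")
    print(" ".join(compile_cmd))
    print(" ".join(random_search_command(bc)))
```

For a CMake or Makefile project, the same idea would apply per translation unit, with the build system supplying the include paths and defines that each compile command needs.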

Cheers, Chris

gfursin commented 3 years ago

Thanks for sharing, Chris - compiler_gym looks very interesting. Looking forward to checking it out in more detail during vacations! Happy holidays and keep up this great work!

gfursin commented 2 years ago

The latest version of the CK prototype is stable and successfully used by different companies and organizations to automate MLPerf inference submissions. That is why we have decided not to touch it anymore, except for bug fixes in the CK kernel and existing MLOps components. Instead, we are developing a new CK2 framework based on user feedback.

I am closing this issue.