Closed by hrshtv 3 years ago
Hi @hrshtv,
Thank you for your interest!
At the moment, there is no object-oriented class for MILEPOST. However, I plan to gradually make CK more Pythonic in 2021.
In the meantime, you can extract MILEPOST features for a given CK program as follows:
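import ck.kernel as ck

# Ask the program.static.features module to extract features for one CK program.
r = ck.access({'action': 'extract', 'module_uoa': 'program.static.features', 'data_uoa': 'cbench-automotive-susan'})
if r['return'] > 0: ck.err(r)  # a non-zero 'return' code signals an error
features = r.get('dict', {}).get('features', {})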
You need to have MILEPOST GCC installed via CK.
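For reuse from a script, the snippet above could be wrapped in a small helper. This is only a sketch: the function name `extract_milepost_features` is hypothetical, the `access` callable is passed in so the helper can be exercised without CK installed, and reading the message from an `'error'` key is an assumption about CK's return dictionary.

```python
from typing import Any, Callable, Dict


def extract_milepost_features(
    access: Callable[[Dict[str, Any]], Dict[str, Any]],
    data_uoa: str,
) -> Dict[str, Any]:
    """Request static features for one CK program and return the 'features' dict.

    `access` is expected to behave like ck.kernel.access: it takes a request
    dict and returns a result dict with an integer 'return' code.
    """
    request = {
        "action": "extract",
        "module_uoa": "program.static.features",
        "data_uoa": data_uoa,
    }
    result = access(request)
    # Same convention as the snippet above: 'return' > 0 signals an error.
    # The 'error' key holding the message is an assumption.
    if result.get("return", 0) > 0:
        raise RuntimeError(result.get("error", "unknown CK error"))
    # A missing 'dict' or 'features' entry yields an empty dict, as above.
    return result.get("dict", {}).get("features", {})
```

With CK and MILEPOST GCC installed, the call would presumably look like `extract_milepost_features(ck.access, 'cbench-automotive-susan')`. Raising an exception instead of calling `ck.err` is a design choice here, so that the caller decides how to handle failures.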
If you want to extract features from arbitrary source code, copy an existing CK program to a dummy CK program, add your source code, and list it in the CK meta, roughly as follows:
ck cp program:cbench-automotive-susan program:my-dummy-program
ck find program:my-dummy-program
# Add source code there; and add its name to .cm/meta.json
ck extract program.static.features:my-dummy-program
If it sounds useful, I can provide more explanations ...
Also, @ChrisCummins is working on a related infrastructure and he mentioned that he plans to release it soon - they are using cool deep learning techniques to learn optimization heuristics and you may be interested to follow their projects too!
Thanks for the explanation! Is there any documentation that explains the arguments of the functions used? For example, ck.access({...})
Some limited description is available at https://ck.readthedocs.io/en/latest/src/ck.html#ck.kernel.access .
This function always takes dict as input with
You can find the input keys and the output dictionary for a given module and action from the command line as follows:
ck extract program.static.features --help
UOA is an abbreviation for "CK UID or alias", i.e. you can use either the user-friendly name such as "program.static.features" or its internal UID (92a02f0445148203).
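To tell the two forms of a UOA apart in a script, a check along these lines might help. It is a sketch that assumes CK UIDs are always 16 lowercase hexadecimal digits, which matches the UID quoted above but is not stated in this thread; `looks_like_ck_uid` is a hypothetical helper, not part of CK.

```python
import re

# Assumption: a CK UID is exactly 16 lowercase hexadecimal digits,
# as in the example UID 92a02f0445148203.
_UID_PATTERN = re.compile(r"[0-9a-f]{16}")


def looks_like_ck_uid(uoa: str) -> bool:
    """Return True if the string has the shape of a CK UID rather than an alias."""
    return _UID_PATTERN.fullmatch(uoa) is not None
```

Either form can then go into the `module_uoa` or `data_uoa` key of the request dictionary.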
My hope/goal is to update all help pages for major APIs in 2021 ...
Hi @hrshtv, I'm following up here at Grigori's request with something that might be of interest to you. We just launched CompilerGym, a research platform for compiler autotuning. In particular, it exposes a handful of different program representations through a simple python interface.
For LLVM, we have a variety of different program representations, though not MILEPOST (I'll look into how much work it would take to add).
The general usage would be:
$ clang-10 -emit-llvm -c myapp.cc
>>> import gym
>>> import compiler_gym
>>> from compiler_gym.service.proto import Benchmark, File
# load the LLVM-IR file:
>>> path = "/path/to/myapp.bc"
>>> benchmark = Benchmark(uri=f"file:///{path}", program=File(uri=f"file:///{path}"))
# create a compiler session:
>>> env = gym.make("llvm-v0")
>>> env.reset(benchmark)
>>> env.observation["Programl"]
<networkx.classes.multidigraph.MultiDiGraph object at 0x7f9d8050ffa0>
>>> env.observation["Inst2vec"]
array([[-0.26956588, 0.47407162, -0.36637706, ..., -0.49256894,
0.8016193 , 0.71160674],
[-0.59749085, 0.63315004, -0.0308373 , ..., 0.14833118,
0.86420786, 0.44808227],
[-0.59749085, 0.63315004, -0.0308373 , ..., 0.14833118,
0.86420786, 0.44808227],
...,
[-0.37584195, 0.43671703, -0.5360456 , ..., 0.6030259 ,
0.82574934, 0.6306344 ],
[-0.59749085, 0.63315004, -0.0308373 , ..., 0.14833118,
0.86420786, 0.44808227],
[-0.43074277, 0.8589559 , -0.35770646, ..., 0.28785184,
0.8492773 , 0.8914213 ]], dtype=float32)
>>> env.observation["Autophase"]
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0])
where ProGraML and inst2vec are two recent state-of-the-art deep learning representations.
Cheers, Chris
Edit: typos, see question below
Hey Chris,
Thanks for sharing - looks really cool!
I got stuck with the above example on the following line:
env.reset(benchmark="file:////home/gfursin/work/susan.bc")
ValueError: Unknown benchmark "file:////home/gfursin/work/susan.bc"
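One detail visible in the failing URI: prefixing an absolute path with `file:///` yields four slashes, because the path already starts with one. Whether this contributes to the `ValueError` is not confirmed in this thread. A minimal sketch for building the conventional three-slash form from a POSIX path, using only the standard library (`bitcode_uri` is a hypothetical helper name):

```python
from urllib.parse import quote


def bitcode_uri(path: str) -> str:
    """Build a file:// URI from an absolute POSIX path."""
    if not path.startswith("/"):
        raise ValueError(f"expected an absolute POSIX path, got {path!r}")
    # The leading slash of the path supplies the third slash of 'file:///'.
    return "file://" + quote(path)


path = "/path/to/myapp.bc"
print(bitcode_uri(path))   # three slashes after 'file:'
print(f"file:///{path}")   # four slashes, as in the failing call
```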
The example at https://github.com/facebookresearch/CompilerGym worked fine:
...
; Function Attrs: nounwind
declare i32 @sprintf(i8*, i8*, ...) #3
; Function Attrs: nounwind
declare double @pow(double, double) #3
attributes #0 = { nounwind uwtable "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind readnone speculatable willreturn }
attributes #2 = { "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #3 = { nounwind "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #4 = { nounwind }
[ 0 0 7 3 4 7 6 4 1 6 0 0 0 14 0 13 22 5
19 34 5 12 23 7 2 0 2 21 0 2 12 0 13 23 7 6
0 32 0 0 0 1 7 0 0 23 0 0 0 0 14 136 106 5
0 61]
...
Will dig further into your project during vacations.
Thanks again for the update!!! Grigori
I moved this question here: https://github.com/facebookresearch/CompilerGym/issues/12 .
This link has a nice interface for extracting all MILEPOST static program features with the click of a button. Can we do the same thing programmatically in python? I'm looking for something along the following lines:
Is something like this possible?