gleisonsdm / DawnCC-Compiler

A source-to-source compiler for automatic parallelization of C programs through code annotation.
http://cuda.dcc.ufmg.br/dawn/

Non-uniform architectures and systems #8

Closed dumblob closed 7 years ago

dumblob commented 7 years ago

I wonder how DawnCC copes with non-uniform systems like big.LITTLE and with heterogeneous environments in general. Can DawnCC automatically distinguish which parallel computations need frequent access to RAM and should therefore run on CPU cores rather than overloading the CPU<->GPU bus (with frequent cache invalidation and other side effects)? Can DawnCC identify the computations that do not need frequent access and run them on the GPU if one is available? Can this be made configurable? Could it be configured automatically at runtime (CPUs and GPUs being added to, removed from, enabled, or disabled on the computer for power efficiency)?

brenocfg commented 7 years ago

DawnCC is a strictly Source-to-Source static compiler, so it won't perform any dynamic or runtime analysis to optimize for specific architectures or other metrics. In that sense, it's a bit of a "naive" implementation, in that it will attempt to offload basically any computation that its static analyses deem can be safely moved to an accelerator.
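To make "offloading through annotation" concrete, here is a hand-written sketch of what an annotated loop can look like. It is not actual DawnCC output: the function `saxpy` is a hypothetical example, and the OpenMP 4.x `target` directive is assumed as the annotation style.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: each iteration touches only x[i] and y[i], so there
   are no loop-carried dependences and the loop is safe to run in parallel. */
void saxpy(int n, float a, const float *x, float *y) {
    assert(n >= 0 && x != NULL && y != NULL);

    /* Hand-written offloading annotation (assumed style, not tool output).
       The map clauses name the array ranges copied to and from the device.
       When the file is built without OpenMP enabled, the pragma is ignored
       and the loop runs sequentially on the host. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Note that the annotation is unconditional: nothing in it asks whether copying `x` and `y` across the bus is worth it, which is the "naive" behaviour described above.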

We do have a project on optimizing for specific architectures and applying cost models to determine when it is profitable to offload a computation, through static context-sensitive scheduling. That is a work in progress and has not been published yet, but I can keep you up to date with it if you're interested!
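As a rough illustration of what a profitability check of this kind could look like, here is a minimal sketch. It is not the model from the project mentioned above: the struct `offload_estimate`, its fields, and the function `offload_is_profitable` are all hypothetical, and the estimate is a simple linear one (fixed overhead plus transfer time plus compute time).

```c
#include <assert.h>
#include <stdbool.h>

/* All fields are hypothetical, illustrative parameters. */
typedef struct {
    double bytes_moved;     /* data copied between host and device, in bytes */
    double work_ops;        /* arithmetic operations in the region */
    double bus_bytes_per_s; /* effective transfer bandwidth */
    double bus_latency_s;   /* fixed per-offload overhead, in seconds */
    double cpu_ops_per_s;   /* sustained host throughput */
    double gpu_ops_per_s;   /* sustained device throughput */
} offload_estimate;

/* Returns true when the estimated device time, including the cost of moving
   the data, is lower than the estimated host time. */
bool offload_is_profitable(const offload_estimate *e) {
    assert(e->bus_bytes_per_s > 0.0);
    assert(e->cpu_ops_per_s > 0.0 && e->gpu_ops_per_s > 0.0);

    double cpu_time = e->work_ops / e->cpu_ops_per_s;
    double gpu_time = e->bus_latency_s
                    + e->bytes_moved / e->bus_bytes_per_s
                    + e->work_ops / e->gpu_ops_per_s;
    return gpu_time < cpu_time;
}
```

Under this sketch, a region that moves a lot of data per operation stays on the CPU because the transfer term dominates, while a compute-heavy region with little data movement is offloaded.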

brenocfg commented 7 years ago

Oops, did not mean to close this just yet.

dumblob commented 7 years ago

> but I can keep you up to date with it if you're interested!

Currently it's not a bottleneck for me, but I'm definitely interested. If this effort is publicly accessible somewhere, please let me know, so that I can follow the progress and news.

dumblob commented 7 years ago

There are also more diverse CPU/SoC core architectures than big.LITTLE, e.g. three different types of cores at once, each in a different quantity. One example is MediaTek’s deca-core Helio X30, which has two Cortex-A73 cores, four Cortex-A53 cores, and four Cortex-A35 cores.

pronesto commented 7 years ago

Hi. Yes, we are actively working on this! We have now submitted a paper. If you want, you can write to me, and I can share the PDF with you (fernando@dcc.ufmg.br)

dumblob commented 5 years ago

Any news on this since then? I have to admit I didn't closely follow the commits in this repository, so a quick wrap-up would be neat :wink:.

dumblob commented 5 years ago

ping :wink: (any news on this topic would be appreciated)

dumblob commented 2 years ago

@pronesto any news on this?

dumblob commented 2 years ago

ping @JWesleySM

pronesto commented 2 years ago

Hi @dumblob: yes, we do have news! Last year we published a paper on a stochastic approach to optimizing code scheduling on big.LITTLE architectures: https://dl.acm.org/doi/10.1145/3478288. You can get a free version here: https://hal-amu.archives-ouvertes.fr/MIC/lirmm-03366078v1. Before that, we published a paper that uses context-sensitive analyses to decide when to move code to the device: https://homepages.dcc.ufmg.br/~fernando/publications/papers/Poesia17.pdf

dumblob commented 2 years ago

I'll need to find some time to read them, thanks!

@VictorTaelin this might be of interest to you (considering the discussion in https://github.com/Kindelia/HVM/issues/38).

pronesto commented 2 years ago

Hi @dumblob: thank you for the heads up. I just browsed over the HVM project. Very cool! I saw that you've pointed out DawnCC to Victor. Thank you for that!