nrnb / GoogleSummerOfCode

Main documentation site for NRNB GSoC project ideas and resources
Add ARM (Raspberry Pi) LLVM Support to High Performance Simulator libRoadRunner #73

Closed hsauro closed 3 years ago

hsauro commented 7 years ago

Background

libRoadRunner (http://libroadrunner.org/) is a high performance SBML based simulator that exploits LLVM (http://llvm.org/) to achieve very fast simulation times. The current version of libRoadRunner runs on Intel based processors on Windows, Mac and Linux.

Goal

The goal of this project is to modify libRoadRunner so that it can be deployed on ARM based systems such as the Raspberry Pi 3 or other suitable single board ARM based computers. The intent is to create a small cluster of Raspberry Pis in order to run simulations in parallel. The work will include a study to investigate the cost effectiveness of using small form factor ARM based boards as a cluster machine for systems biology research.

Skills Required:

Cross-platform development in C++ using CMake. Familiarity with LLVM, Raspberry Pi, ODROID-XU4 or PINE64 would be a definite advantage. However, LLVM training on site will be possible.

If successful, a research paper will be written and published.

Public Repository

https://github.com/sys-bio/roadrunner

Possible Mentors

Herbert M Sauro, Andy Somogyi, Matthias König

Main Contact

Herbert M Sauro

References

libRoadRunner: a high performance SBML simulation and analysis library. Endre T. Somogyi, Jean-Marie Bouteiller, James A. Glazier, Matthias König, J. Kyle Medley, Maciej H. Swat, Herbert M. Sauro. Bioinformatics (2015) 31 (20): 3315-3321.

0u812 commented 7 years ago

Added @0u812 so I'll get updates

matthiaskoenig commented 7 years ago

Just to add: This is an important milestone for translating computational models into the clinic and into routine applications such as risk assessment and personalized models. In the future, ODE based models will have to be routinely transferred from desktop development environments to hand-held ARM based devices such as mobile phones and tablets, or to mini ARM computers like the Raspberry Pi. Providing the infrastructure to readily deploy SBML models on such devices opens the door to many new applications. A second main field of application is the calculation of distributed models on cheap ARM clusters, with many multi-core boards already in the pipeline.

doppioandante commented 7 years ago

Hello, I'm interested in this project. I'm quite comfortable with C++, cmake, linux and git. I have no experience with LLVM but I don't think it would be a problem for me to learn it. Are there any other prerequisites? What is the expected level of background in biology?

hsauro commented 7 years ago

Only a minimal background in biology is required. However, we will require someone who can write portable software that runs on Windows, Mac and Linux.

doppioandante commented 7 years ago

I'm okay with multiplatform code. Regarding clustering: is there support for clusters of x86-64 machines? Should it be roadrunner's responsibility to manage the cluster and to schedule simulations?

hsauro commented 7 years ago

I would talk to Kyle Medley and Matthias Konig about the clustering (cc'd on this); they have a better idea of this than I do. One thing I'd like to see is libroadrunner running on a small Raspberry Pi cluster to see how it performs. But we'd also like to see it work on much larger machines.

0u812 commented 7 years ago

The simulator isn't cluster-aware so you don't have to worry about that for this project. I think the best place for a cluster implementation would be outside of the simulator itself, ideally using some pre-existing framework.

matthiaskoenig commented 7 years ago

I agree with Kyle on this. The important part is getting the simulator to run on a single ARM core. An external framework will then be used to distribute the simulations across the cores in the cluster. If the simulations are independent, this could be something like a task queue, for instance Celery (http://www.celeryproject.org/):

Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.

The distribution of tasks to the cluster nodes, the synchronisation and calculation of derived results (post processing of cluster results) will be managed outside of roadrunner.
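
A minimal sketch of the "independent simulations as queued tasks" idea described above. It is not Celery and does not call roadrunner: `simulate_decay` is a hypothetical placeholder for one simulator run, and the standard library's `concurrent.futures` stands in for a distributed task queue so that the example runs on a single machine.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def simulate_decay(k, x0=1.0, t_end=1.0, steps=1000):
    """Placeholder for one independent simulation (hypothetical, not roadrunner).

    Integrates dx/dt = -k * x with the explicit Euler method and
    returns the value of x at t_end.
    """
    dt = t_end / steps
    x = x0
    for _ in range(steps):
        x += dt * (-k * x)
    return x


def run_parameter_scan(rate_constants, max_workers=4):
    """Run one simulation per rate constant and collect results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() hands each parameter to a worker; the simulations share no state.
        return list(pool.map(simulate_decay, rate_constants))


if __name__ == "__main__":
    ks = [0.1, 0.5, 1.0, 2.0]
    for k, x_end in zip(ks, run_parameter_scan(ks)):
        print(f"k={k}: x(1)={x_end:.4f} (exact {math.exp(-k):.4f})")
```

With a real task queue, `simulate_decay` would presumably become a registered task and `run_parameter_scan` would submit jobs to workers on the cluster nodes. Threads are used here only to keep the sketch portable; they are not meant as a performance recommendation.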

doppioandante commented 7 years ago

I've been poking some more at libroadrunner and I managed to build it.

I'm currently trying to understand what specifically the ARM port would entail. All the libraries that roadrunner depends upon are readily available for arm64, armel and armhf. The code is moderately big and I still have to look at it in detail, but so far I haven't found anything that would have to be explicitly rewritten for ARM: the LLVM backend is generic and targets the LLVM IR. What am I missing?

The ARM cluster sounds like a very interesting project and I would gladly take it, provided that it fits the scope of this project.

Linux is an obvious target. Windows has ARM support with the IoT Core version; do you consider it a possible target? iOS is not an option because of its security policies against JIT compilers (unless I add a JavaScript backend...), while Android support should eventually come for free with Linux support.

0u812 commented 7 years ago

Hey Enrico. First, some background on LLVM. There have been three JIT engines implemented in the library to date. In chronological order:

  1. The legacy JIT (the one roadrunner uses, no ARM support)
  2. MCJIT (has ARM support)
  3. ORC (has ARM support I believe)

The first two have essentially the same API. Andy Somogyi created a branch which uses the MCJIT instead. We decided not to merge this branch into the main branch because it breaks some language wrappers. In other words, we're still using the legacy JIT API (which was removed in LLVM 3.6). It's a little bit backwards at this point, but it works, which is the most important thing.

What I would suggest is that you copy Andy's MCJIT code generator (which lives in the source/llvm directory) into our main branch without overwriting our current LLVM code. I can explain how to do this. Once you get to the point where you have the ARM part working in the main branch, then we can look at the cluster stuff. Sound good?

0u812 commented 7 years ago

Also, as far as platforms go, I would focus on the Raspberry Pi first, then if you have time (and are so inclined) you can look into the other platforms you mentioned. I think @hsauro is also most interested in the Pi at this point, but he can correct me if I am wrong.

doppioandante commented 7 years ago

@0u812 Thanks, the issue is much clearer to me now. I will try to get a build on Raspbian tomorrow. In any case, the end goal would be to rebase the mcjit branch onto the current dev branch?

0u812 commented 7 years ago

For stability reasons, I think we should leave the current LLVM generator in source/llvm (https://github.com/sys-bio/roadrunner/tree/develop/source/llvm) alone. Instead, you can copy the source/llvm directory from the mcjit branch (https://github.com/sys-bio/roadrunner/tree/mcjit/source/llvm) into a new directory of the main branch, e.g. source/mcjit. Then you would have two LLVM generators. That gives us a good tradeoff between stability (we won't have to worry about breaking the existing code) and being able to experiment with the mcjit generator. If anything is unclear I can Skype with you to help you work it out.
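
The two-generator layout can be pictured with a small, purely illustrative sketch. The backend names, the `BACKENDS` table and `choose_backend` are hypothetical and not part of roadrunner; the only facts taken from this thread are that the legacy JIT lacks ARM support and that MCJIT has it.

```python
import platform

# Hypothetical table of code generators living side by side in the source tree.
# Directory names follow the suggestion above; ARM support is as stated in this thread.
BACKENDS = {
    "legacy": {"directory": "source/llvm", "supports_arm": False},
    "mcjit": {"directory": "source/mcjit", "supports_arm": True},
}


def is_arm(machine):
    """Return True for common ARM machine strings such as 'armv7l' or 'aarch64'."""
    name = machine.lower()
    return name.startswith("arm") or name.startswith("aarch64")


def choose_backend(machine=None):
    """Pick the existing generator by default and an ARM-capable one on ARM."""
    if machine is None:
        machine = platform.machine()
    if is_arm(machine):
        for name, info in BACKENDS.items():
            if info["supports_arm"]:
                return name
        raise RuntimeError("no ARM-capable backend available")
    return "legacy"


if __name__ == "__main__":
    backend = choose_backend()
    print(backend, BACKENDS[backend]["directory"])
```

Keeping the existing generator as the default on non-ARM machines mirrors the stability argument: nothing changes for current users unless the new backend is explicitly needed.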

Btw just to let you know, our main branch is "develop". The "master" branch is only used for releases (we never commit to it directly). When you clone the repo, it will automatically check out the "develop" branch. I just wanted to explain that so you know why.

matthiaskoenig commented 7 years ago

If there is a first prototype of the ARM branch compiling for the raspberry let me know. I would love to test this here on my device and give feedback if it's working.

Great work, I am very happy that this is getting some steam.

doppioandante commented 7 years ago

Thanks for the help. Unfortunately I have a pretty nasty fever today, so this will have to wait a bit.

Some questions in the meantime: do you prefer compiling on the host system or cross-compiling? Some people are allergic to cross-compiling toolchains, but the advantages are faster builds and a more up-to-date compiler. With CMake it is not even that difficult, and I've done it before.

Regarding the compiler, is there a minimum supported version? I'm using the one from the CI as a reference for now.

Lastly, do I have to care about the bindings other than Python? E.g. the Delphi one looks old.

matthiaskoenig commented 7 years ago

I only care about the Python bindings; this is what I use in production to run my simulations (and will also use on the cluster). I think this is sufficient because most libraries required later on are available in Python (have Python bindings), e.g. pyspark, MPI, ...

Compiler questions are for Kyle. Personally, I don't have a problem with compiling on the host system, but cross-compilation will make distribution easier. My vision is to generate a custom ROM for the Raspberry Pi that contains the Linux OS with all required libraries installed, so the Pis can be hooked up directly. This would include the ARM roadrunner build. Then you only have to distribute the ROM to the Pis, set the IPs for the compute nodes, and distribute the workload from a head node of the cluster (probably a desktop machine) across the cluster.
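
The head-node idea can be sketched as a simple round-robin assignment of independent jobs to compute nodes. Everything here is an assumption for illustration: the node addresses are placeholders from the documentation range 192.0.2.0/24, the job names are made up, and actually sending work to the nodes is left to whatever framework runs on the cluster.

```python
from itertools import cycle


def assign_round_robin(jobs, nodes):
    """Map each compute node to the list of jobs it should run.

    Jobs are dealt out in order, one node at a time, so the load differs
    by at most one job between any two nodes.
    """
    if not nodes:
        raise ValueError("at least one compute node is required")
    plan = {node: [] for node in nodes}
    for job, node in zip(jobs, cycle(nodes)):
        plan[node].append(job)
    return plan


if __name__ == "__main__":
    # Placeholder addresses; a real cluster would use the IPs set on each Pi.
    nodes = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]
    jobs = [f"model_{i}.xml" for i in range(7)]
    for node, assigned in assign_round_robin(jobs, nodes).items():
        print(node, assigned)
```

A static plan like this only makes sense when all jobs take roughly the same time; a task queue that hands out work on demand would balance uneven jobs better.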

hsauro commented 7 years ago

Don't worry about the Delphi bindings, I look after those and they shouldn't change since we won't be changing the API.

0u812 commented 7 years ago

@doppioandante no worries. Hope you get well soon. I don't think you will have to change the API so the language bindings should not be affected. On ARM, I usually build with just the Python bindings enabled.

0u812 commented 7 years ago

Hi @doppioandante just checking if you're still out there

doppioandante commented 7 years ago

Hello, I'm sorry for not checking in, but I had a harder exam session than anticipated and GSoC just went by. The upside is that I took a course in basic mathematical biology, so I'm more interested in this now. I plan to work on this a bit on my own. I currently have a branch, mcjit-llvm3.8 (https://github.com/doppioandante/roadrunner/tree/mcjit-llvm3.8), based on mcjit with a few edits for (gasp) LLVM 3.8, and it almost compiles (I have an unrelated problem with the bzip2 dependency of sbml).

hsauro commented 7 years ago

That sounds interesting, do you have access to something like a raspberry pi to try it out?

0u812 commented 7 years ago

Hi Enrico, that sounds like an interesting development! However, please keep in mind that we already have a branch which supports MCJIT and ARM, but we can't merge it in because it breaks other things. The main point of this project is to provide a solution that doesn't affect the current code by e.g. putting your new code in a source/llvm2 directory and making it independent from the current LLVM backend. If you are interested in contributing, we could get in touch at some point to discuss specifics.

doppioandante commented 7 years ago

@hsauro: "That sounds interesting, do you have access to something like a raspberry pi to try it out?"

I will test it on my Raspberry Pi, but cross-compiling to ARM is not finished yet.

@0u812: "Hi Enrico, that sounds like an interesting development! However, please keep in mind that we already have a branch which supports MCJIT and ARM, but we can't merge it in because it breaks other things."

OK, I will move it to another folder. I will get in touch by email.

0u812 commented 7 years ago

Great! My email is medleyj@uw.edu. I'm pretty busy this week but I should be more available after that.

IceCereal commented 4 years ago

Hello, since this is still open and is in GSoC this year, is there an update available for anyone viewing this straight from the GSoC website?

hsauro commented 4 years ago

No update, it's the same project.
