Closed cannin closed 1 year ago
Hi @DesmondYuan, I found this issue interesting and would love to work on it. I have done some deep learning projects using the TensorFlow framework and am keen to learn PyTorch. Could you please help me get started with the project? Thank you!
Hi! @cannin and @DesmondYuan
I am interested in working on this issue. I have used the TensorFlow framework for my projects and have also explored the PyTorch framework. I would love to make valuable contributions to this project while enhancing my knowledge of the PyTorch framework. After going through the required resources, how should I proceed with the task?
Also, what will be the preferred medium of communication?
Thank You.
@Ishan2601 for the time being here on GitHub is good. I added a "Getting Started" section.
@cannin and @DesmondYuan I have gone through the documentation. I have previous experience working with PyTorch and TensorFlow, and I have refactored neural network code from TensorFlow to PyTorch. I think I would be able to contribute to this project. Could you help me set this up and get started with the issue? Thank you.
@palashgarg0109 @Ishan2601 @Debanitrkl First of all, thank you for your interest in this project! As @cannin mentioned, please refer to the "Getting started" section to kick off. Once you have a preliminary test run with the existing TF code, you might be able to have a better understanding of what the summer project would look like!
I'm getting this error while setting up CellBox. Could anyone help? @DesmondYuan @cannin @judyueshen
@Debanitrkl thanks for digging into CellBox. If you find issues with CellBox, please post them directly on the CellBox repo: https://github.com/sanderlab/CellBox/issues so we can stay organized. Please repost your question there.
Feel free to post questions more focused on how to proceed with this project here.
@cannin @DesmondYuan I have successfully installed and run the existing code on my PC. I am having some difficulty understanding the data files expr_index.txt and expr.csv: what does each row and column represent in both files? Also, I think there is an error in the ReadMe file at One Click Model Construction Step 2: https://github.com/sanderlab/CellBox#step-2-use-mainpy-to-construct-models-using-random-partition-of-dataset. There is no file named Example.random_partition.cfg.json in the configs folder; it should be Example.random_partition.json instead.
@palashgarg0109 Thanks for looking into the code and thanks for the typo report. It is fixed now.
Regarding the data preparation, please refer to https://github.com/sanderlab/CellBox#data-files-in-data-folder for more information. Feel free to let us know if you need any additional clarification!
Hi @cannin and @DesmondYuan
I am interested in this project, and have experience with PyTorch. Is there anything I need to sign up for or do I just make a PR once done?
Elam
@T3chy Thanks. And yes, a PR would work fine. You might want to load the TensorFlow code first and do some quick test runs. Note that some of the models in model.py, e.g. the CoExp class, are currently deprecated, so try to focus only on the CellBox class and the abstract class PertBio to avoid distraction.
@DesmondYuan @cannin I have understood the code written using the TensorFlow framework to quite an extent, and now I am very excited to work further on the project. Could you please suggest what I should do next? Also, I am keen to know what criteria the mentors take into account when selecting a student for a project, knowing that there is already competition for this project.
@palashgarg0109 see https://nrnb.org/gsoc.html for how to apply; your proposal needs to reflect the goals (above) of the project
Hi all, quick question out of curiosity. Why switch to PyTorch and not JAX? If I understood the CellBox paper correctly, automatic differentiation is the key ingredient, and JAX seems to have more functionality built around autodiff for a variety of settings. This may be a dumb question but I was curious enough to ask. Thanks!
@vz415 Great question. Ideally JAX and Zygote in Julia should indeed work better with such tasks! We thought torch and TF are more widely used so they might be easier for a summer project. But if you are interested in JAX, please definitely feel free to mention it in the proposal. We would love to see it.
Cleanup in preparation for GSoC 2022.
Hi @cannin, I am interested in this project. I have experience with PyTorch and TensorFlow, and I have done many machine learning projects.
Could you help me get started with the project? I would also like to know if there are any updates from last year.
Thanks in advance!
Hi @cannin @DesmondYuan, I have experience in both TensorFlow and PyTorch and have published some research works using both of them (This work is done in TensorFlow v2, while this work is done in PyTorch).
I believe I can contribute to this project and am interested in learning more about PyTorch and TensorFlow.
I have downloaded the existing code. Will run and see if there are any issues, will let you know here.
Hoping to have a long term collaboration starting with GSOC'23.
Thanks
NRNB has been accepted as a mentoring organization for GSoC 2023! Contributor applications open on March 20. Here are some useful links:
GSoC contributor guide, NRNB project proposal template, Eligibility requirements, Full program timeline
Hi @cannin @DesmondYuan , I would like to contribute to the project. I am working on the present code. Hope to collaborate in GSoC'23.
Hello @cannin @DesmondYuan, I am Aditya Vardhan from IIT Bhilai, India. I am proficient with Python, TensorFlow and PyTorch and have worked on many projects with them. I find this project interesting. Can you please help me get started? I am going through the code right now and look forward to contributing.
Hello @cannin and @DesmondYuan ,
Thank you for setting up this project for us! I am Phuc Nguyen, a 3rd year undergraduate student at the University of Cincinnati majoring in Biomedical Engineering and Computer Science, and I have extensive experience in TensorFlow, PyTorch, and relevant research in deep learning for life sciences. I will write an additional email to you to send my resume. I am best reached via email (nguye6tp@mail.uc.edu), and via comments in this issue. I am very excited about this project!
To get started, I have some questions. Others may have similar questions, so please add to the list if you have any:
What is the current progress of the project? I saw we had some discussions in this issue back in 2021. Did any of the participants join the project and make some progress?
How will we run the models? Are we mentees provided with a cloud service or remote cluster, or do we have to run the code on our computer?
Are we provided all the necessary data and hyperparameters to retrain the models in Pytorch? For me the ultimate goal is not only to convert the code to Pytorch but also to have the trained models that have the same performance as in Tensorflow. So knowing all the hyperparameters and small subtle details is really important.
Is CellBox being used by other research groups or projects to estimate drug perturbation effects? I'm interested to see to what extent CellBox has been used by other researchers, because changing from Tensorflow to Pytorch means we have to update the documentation as well for others to use.
What is the next stage for CellBox? Is there ongoing research on improving CellBox and adding more functionality? If the research group is still looking for ways to improve CellBox performance by, say, changing the model architecture, I am happy to join the discussions as well if possible!
I am working on my proposal right now. May I send its first draft to you by early next week so I can get some feedback from you? And is it better for me to post the questions here or email you?
Thank you! Phuc
@Mustardburger answers to your questions:
Hi @cannin @DesmondYuan @judyueshen I am a Master's student studying AI at the University of Hamburg, Germany. I have knowledge of topics like statistical ML, NLP, and computer vision. I have worked on multiple projects in Python, PyTorch, and Keras. Apart from the data science stack, I also have experience working in Java, PHP, and Swift. I came across this topic and got interested in working on it. I am a bit late to apply, but I am interested in contributing to and gaining experience from this project. Applying ML to health and biological problems has been an area of curiosity for me. Could you please help me get started with the project, and could a call be set up for a discussion?
@Anirbanbhk88 there is still time to apply; start by focusing on the "Getting Started" section; email first if questions arise as you draft an application.
This project is an active GSoC 2023 project. Closing this issue because it is no longer available for other contributors/students.
Background
CellBox (https://www.cell.com/cell-systems/pdfExtended/S2405-4712(20)30464-6) is a machine learning algorithm (built using TensorFlow) for generating network models that predict cellular responses to large-scale perturbation experiments. The output is executable models capable of predicting the effects of cancer therapies. CellBox includes explicit models of cell dynamics in a machine-learning framework, and therefore can be both accurate and interpretable.
Goal
The current CellBox models are limited by the static graph feature of TensorFlow v1. To further increase the model scope, the Sander lab is working on transitioning the CellBox algorithm from TensorFlow to PyTorch, a different machine learning framework that builds its computational graph dynamically. A dynamic computational graph will allow flexible manipulation of the network structure and the introduction of stochasticity into differential equations during model training. The goal of this summer project is to help in that effort. A student would need to be able to construct CellBox in PyTorch, reproduce the training results of the original model (in TensorFlow), and also develop new features of the optimization tool.
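To make the dynamic-graph point concrete, here is a minimal sketch in PyTorch. It is not the CellBox implementation: the class name `ToyPerturbationODE`, the tanh-based toy equation, the Euler integration, and all sizes are assumptions chosen only for illustration. It shows a plain Python loop whose length changes per call, with optional noise injected at each integration step, both of which autograd handles without rebuilding a static graph.

```python
import torch


class ToyPerturbationODE(torch.nn.Module):
    """Hypothetical toy model for illustration; not the CellBox equations."""

    def __init__(self, n_nodes: int):
        super().__init__()
        # Learnable interaction matrix (name and initialization are illustrative).
        self.W = torch.nn.Parameter(0.1 * torch.randn(n_nodes, n_nodes))

    def forward(self, u, n_steps, dt=0.1, noise_std=0.0):
        x = torch.zeros_like(u)
        # Ordinary Python control flow: the graph is recorded as the loop runs,
        # so n_steps may differ on every call.
        for _ in range(n_steps):
            dxdt = torch.tanh(x @ self.W.T + u) - x
            x = x + dt * dxdt  # explicit Euler step
            if noise_std > 0:
                # Euler-Maruyama style noise term, scaled by sqrt(dt).
                x = x + noise_std * (dt ** 0.5) * torch.randn_like(x)
        return x


torch.manual_seed(0)
model = ToyPerturbationODE(n_nodes=4)
u = torch.randn(8, 4)        # 8 hypothetical perturbation conditions
target = torch.zeros(8, 4)   # placeholder targets, not real data

# The number of integration steps is drawn at run time.
n_steps = int(torch.randint(5, 20, (1,)))
pred = model(u, n_steps=n_steps, noise_std=0.01)
loss = torch.nn.functional.mse_loss(pred, target)
loss.backward()              # gradients flow through every unrolled step
```

After `loss.backward()`, `model.W.grad` holds the gradient through the unrolled, noisy trajectory, even though the step count was only known at run time.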
Getting Started
Difficulty Level 2
The student will need to learn how to interpret the CellBox algorithm and translate the TensorFlow “language” to the PyTorch “language”.
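As a rough idea of what that translation involves, the sketch below is a generic PyTorch training loop with comments naming the TensorFlow v1 idiom each line would typically replace. The data, the variable `w`, and the learning rate are made up for illustration and do not come from the CellBox code.

```python
import torch

torch.manual_seed(0)

# Synthetic regression data (illustrative only).
# TF1: inputs would be tf.placeholder tensors fed through feed_dict.
x = torch.randn(64, 3)
true_w = torch.tensor([[1.0], [-2.0], [0.5]])
y = x @ true_w

# TF1: tf.Variable, initialized inside a session.
w = torch.nn.Parameter(torch.zeros(3, 1))

# TF1: tf.train.AdamOptimizer(0.1).minimize(loss) defines the train op once.
optimizer = torch.optim.Adam([w], lr=0.1)

losses = []
for step in range(200):
    optimizer.zero_grad()               # no TF1 equivalent; gradients accumulate in PyTorch
    pred = x @ w                        # executed eagerly, no Session needed
    loss = torch.mean((pred - y) ** 2)  # TF1: tf.reduce_mean(tf.square(...))
    loss.backward()                     # TF1: gradients are part of the train op
    optimizer.step()                    # TF1: sess.run(train_op, feed_dict=...)
    losses.append(loss.item())
```

The main shift is that the TF1 version defines the graph once and then runs it repeatedly inside a session, while the PyTorch version re-executes ordinary Python code on every step.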
Skills
Public Repository
Potential Mentors
Augustin Luna, Bo Yuan