Closed: GemmaTuron closed this issue 1 year ago
In this example, following the tasks of Wednesday 21st June:
I have created all the templates for the interns and spent a couple of hours revising the GitHub Project and updating the tasks. I have been working on identifying the bug in the GitHub Actions, solved in this issue. I have set a meeting with @miquelduranfrigola to discuss in detail his comments on the Model testing discussion, but I haven't been able to start writing yet; I aim to do so by the end of the week. Looking forward to the next WCAIR lesson!
Pro tip: adding the links to the issues and discussions you mention will be very helpful!
@GemmaTuron @DhanshreeA
Today's Tasks List: Wednesday, June 21, 2023
eos9yui on Colab: https://github.com/ersilia-os/eos9yui/issues/3#issuecomment-1600386625
eos31ve: https://github.com/ersilia-os/eos31ve/issues/6#issuecomment-1601044120
eos2r5e: https://github.com/orgs/ersilia-os/projects/1/views/10?pane=issue&itemId=29234354
eos81ew: https://github.com/orgs/ersilia-os/projects/1/views/10?pane=issue&itemId=29234323
eos44zp: https://github.com/orgs/ersilia-os/projects/1/views/10?pane=issue&itemId=29234297
Model refactoring of eos2r5e is still in progress; I'll try to complete it as soon as possible. Done testing model eos31ve on the CLI and Colab; testing on DockerHub right now.
Thanks.
Great, thanks Zakia. I am giving you an extra model just because I won't be able to review until my morning tomorrow, and you probably start working earlier than I do in your timezone. Just in case you finish your previous tasks.
Hi @ZakiaYahya
Please look at this issue. There is so much more information you can add when a workflow is failing; I have written an example of the level of detail you should be aiming for. I hope this is helpful.
Hello @GemmaTuron @DhanshreeA
Today's Tasks List: Thursday, June 22, 2023
Due to a slow network today, my work has been affected. I have refactored model eos2re5; it is working both locally and inside Ersilia with --repo-path, but I have been unable to push the changes so far because of the internet connection. I'll open a PR as soon as I successfully push the changes to the repo.
Now working on refactoring model eos2b6f. I tried running it locally but am encountering errors; I'm working on it. Once I understand why the error is happening, I'll let you know in detail.
Trying to test model eos5505 simultaneously. I was able to run it on Colab, but due to the unstable internet it is taking far too long to fetch from Ersilia. Once it's done, I'll post the results in the relevant issue.
I hope my internet is stable by tomorrow.
Thanks.
Thanks.
Hello @GemmaTuron @DhanshreeA
Tasks List: Friday, June 23, 2023
Will try to work on the error "ERROR: Could not install packages due to an OSError: [Errno 28] No space left on device" over the weekend.
Hi @ZakiaYahya !
Thanks. Make sure to remove all the other Ersilia models you have been working with by running the ersilia delete command, and review your conda envs as well after doing this. It will free up space on your system.
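To confirm how much space the cleanup actually frees, the free disk space can be checked before and after. The following is a minimal sketch using only Python's standard library; the helper names `free_gb` and `has_enough_space` and the 5 GB default threshold are assumptions chosen for illustration, not Ersilia settings.

```python
import shutil


def free_gb(path="."):
    """Return the free disk space, in GB, on the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 1e9


def has_enough_space(path=".", min_free_gb=5.0):
    """Return True if at least `min_free_gb` GB are free (threshold is illustrative)."""
    return free_gb(path) >= min_free_gb


if __name__ == "__main__":
    print(f"Free space: {free_gb():.1f} GB")
```

Running it once before and once after deleting the models and unused conda environments shows how much space was recovered.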
Hello @GemmaTuron Right, I'll delete them all today. Thanks.
Hi @ZakiaYahya I went through your update on eos44zp. Your PR has simply been closed (and not merged) since the model was fairly recent and only required the missing workflow files. Gemma has added them in a separate commit.
Regarding eos2re5, I have left comments on the issue in that repository. The issues with docker build seem to be coming from using the conda command through Docker's RUN directives. I will also have to look more into that, but for the time being I have linked a resource that should be useful in understanding what is going on. Also, the model isn't fully cleaned and there are a few more files that should be removed.
Hello @GemmaTuron @DhanshreeA
Tasks List: Monday, June 26, 2023
I have done model testing and re-refactored model eos2b6f with a newer, compatible version of RDKit. I have opened a PR on it. I'm currently working on model eos2re5, which is failing after merging, and I'm also going through the "Space allocation problem" in model eos44zp, but I haven't been able to resolve it yet.
Thanks.
Hi @ZakiaYahya ! The space problem is related to GitHub Actions. Can you check, for today's meeting, what the limits on GitHub Actions are, and we'll discuss what to do?
Hello @GemmaTuron Yes, I'm working on it. Different forums mention different cache sizes, but most of them mention that up to 10 GB is now offered for workflow actions. I've gone through different forums; we will discuss it in today's meeting.
Hello @GemmaTuron @miquelduranfrigola @DhanshreeA
After going into detail, I found three workarounds, based on either removing pre-cached tools or using a swap-space technique. I'll discuss these three workarounds in today's general meeting:
(1) Removing pre-cached tools from the github runner:
https://github.com/actions/runner-images/issues/2840#issuecomment-790492173
(2) Adding Swap-Space:
https://github.com/pierotofy/set-swap-space
(3) Deleting pre-cached tools + swap space (by doing this, available memory is around 31 GB):
https://github.com/jlumbroso/free-disk-space/releases/tag/v1.1.0
Thanks
This is great stuff, @ZakiaYahya - I have never explored these options. Let me invoke @honeyankit and @GrantBirki from GitHub, let's see if they have additional feedback.
Thanks @miquelduranfrigola Yeah sure, it would be very helpful.
Hey there! I have never explored these options either so I would be curious to follow along with your PRs and see how it all works!
Hello @GemmaTuron @DhanshreeA
Tasks List: Tuesday, June 27, 2023
Tried to find more workarounds for the Out-of-Memory problem in GitHub Actions. I will work on refactoring the new model eos46ev tomorrow. Model eos2v11 is still being worked on, so once it's done, I'll test it.
Thanks.
Thanks @ZakiaYahya - awesome progress
Hello @GemmaTuron @DhanshreeA
Tasks List: Wednesday, June 28, 2023
Model eos2re5: After making changes and opening the PR again, it still fails at "Upload to DockerHub", but it no longer gives the previous error, i.e. CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'. Now it gives a different error, still related to conda, i.e. 'CondaEnvironmentService' object has no attribute 'pid'. Still working on this.
I wasn't able to work on the Out-of-Memory issue in model eos44zp; I will do it once I'm done with refactoring model eos46ev.
Thanks.
@ZakiaYahya !
Great progress. There is a lot on your plate, so don't worry; focus on the issues one by one. If you need help with eos46ev, ping me and I'll try to provide further guidance. I won't be assigning new models to you so you can focus on eos44zp.
Thanks @GemmaTuron
I'll first try to test eos46ev with the latest Ersilia version, then I'll let you know. It seems @febie is also encountering the same problem I encountered in eos46ev. Regarding model eos44zp, I'm thinking of trying the swap-space workaround rather than deleting pre-cached tools, as the latter deletes some of the packages that we require, like Python.
Hello @GemmaTuron @DhanshreeA
Tasks List: Thursday, June 29, 2023
Tested eos46ev for various single SMILES inputs, updated here. Now I'm focusing on the three unresolved issues mentioned above. Thanks.
Thanks for the update @ZakiaYahya ! I'll test eos46ev on my side as well
Hello @GemmaTuron @DhanshreeA
Tasks List: Friday, June 30, 2023
Opened a PR again on model eos44zp with added swap space. Working on the conda issue in model eos2re5. Let me know, @GemmaTuron, when you test single inputs for model eos46ev; I've put it on hold for now. I will start refactoring model eos6pbf once I'm done with the conda issue.
Thanks.
Hello @GemmaTuron @DhanshreeA
Tasks List: Monday, July 3, 2023
Refactored model eos6pbf and opened a PR on it today. Modified test-model.yml for model eos44zp as discussed and opened a PR on it. Tested model eos7asg and updated the relevant issue. Investigating model eos46ev in detail, along with working on the conda issue in model eos2re5.
Thanks
Hi @ZakiaYahya
The space problem in GitHub Actions is not easy to solve; let's try to talk about it today in the meeting. For today, focus on investigating eos46ev and the conda issue on eos2re5, and if you are done with everything, let me know!
Right @GemmaTuron, I'm working on it. I will let you know once it's done. Thanks.
Sorry for hijacking the conversation here a bit, but @GrantBirki could you tell us how much persistent storage space do GitHub runners get on our plan? I couldn't find much that says anything about it. I ask because configuring any amount of swap space would be limited by how much storage we actually have. And can we increase it, if yes, how? And is it possible to increase it specifically for a single model repository and not all of them?
Hello @GemmaTuron @DhanshreeA
Tasks List: Tuesday, July 4, 2023
Working on model eos2re5: made all the changes suggested by Miquel in today's meeting, and I'm now testing it both locally and inside Ersilia using --repo_path. Tested it with a smaller input file; now testing it with the whole eml_canonical. Meanwhile, I'm working on model eos46ev as well, investigating why it is not working with eml_canonical.csv. Opened a model request for one of the CYP450 enzymes, i.e. CYP2CP; once approved, I'll start working on it as well.
Thanks.
Hi @ZakiaYahya
Quite a lot of work on refactoring eos44zp and debugging eos2re5, focus on this before moving onto new tasks
Right @GemmaTuron Working on it.
Hello @GemmaTuron @DhanshreeA
Tasks List: Wednesday, July 5, 2023
I've finished testing model eos2re5 after the suggested changes; it is working both locally and with --repo-path. Done refactoring model eos5jz9 (CYP2CP); it is working locally, but with --repo-path it fails, throwing ModuleNotFoundError: No module named 'sklearn'. I'm working on it; once it works, I'll open a PR. Apart from that, I'm also working on model eos46ev in parallel but haven't been able to resolve it yet. It seems to be failing on some of the SMILES of eml_canonical; I'm figuring out which SMILES are causing the problem.
Thanks.
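One quick check when a model works locally but raises ModuleNotFoundError elsewhere is to list which imports are unavailable in the environment that actually runs it. The sketch below uses only the standard library; the helper name `missing_modules` and the module list in the usage line are assumptions for illustration. Note that the import name `sklearn` is provided by the package `scikit-learn`, so that is the name to install if the module turns out to be missing.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be imported in this environment."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    # Hypothetical list of a model's top-level imports.
    print(missing_modules(["sklearn", "rdkit", "json"]))
```

Running this inside the environment used by the failing step shows whether the dependency is absent there, as opposed to a path or activation problem.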
Hi @ZakiaYahya
Thanks for the update, let me know if you need anything. I've merged the PR on eos2re5.
Sure @GemmaTuron Working on separate models for CYP right now.
Hello @GemmaTuron @DhanshreeA
Tasks List: Thursday, July 6, 2023
Done incorporating models eos5jz9 and eos7nno, and opened PRs on these models as well. Tested model eos7asg on the CLI, Colab and DockerHub. Working on incorporating the third model, eos3ev6, but getting an error while running it with --repo_path. Once it's done, I'll start working on eos46ev.
Thanks.
Hi @ZakiaYahya
I have answered you in Slack. It seems there is an emoji causing trouble? That is surprising, but it seems to be the reason! Your plan of work looks good!
Hello @GemmaTuron @DhanshreeA
Tasks List: Friday, July 7, 2023
Done incorporating model eos3ev6 (CYP3A4) and opened a pull request. All three models split out from model eos44zp are now incorporated in the Ersilia Model Hub. What should we do next with the eos44zp model?
After digging into the details of model eos46ev, I separated out the SMILES that cause problems at prediction time. There are 7 of them out of 443 from eml_canonical. I'm working on identifying NaN or infinity values and handling them appropriately. Which would be more convenient: discarding problematic inputs, or replacing NaN/infinity entries with zeros?
Model eos2re5 again failed at "Upload to DockerHub". I've discussed it with Miquel; we will figure it out in detail on Monday.
I've tested model eos4se9 and it is working fine. Testing of model eos2thm is still pending; I will do it later. Thanks
Hi @ZakiaYahya
Fantastic work as always. Let me quickly answer about eos46ev. We should not discard problematic inputs. That is, we need to ensure that we always have the same number of inputs as outputs.
Hello @miquelduranfrigola Thanks. Yes, to ensure the same number of output entries as input entries, I'm not skipping them; I'm just replacing NaN with zeros.
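The approach described here, replacing non-finite predictions instead of dropping rows, can be sketched as follows. This is a minimal illustration and not the code in the eos46ev repository; the function name `sanitize_predictions` and the `fill_value` parameter are assumptions.

```python
import math


def sanitize_predictions(values, fill_value=0.0):
    """Replace None, NaN and +/-inf with `fill_value`, keeping one output per input."""
    cleaned = []
    for value in values:
        if value is None or not math.isfinite(value):
            cleaned.append(fill_value)
        else:
            cleaned.append(float(value))
    return cleaned


if __name__ == "__main__":
    raw = [0.2, float("nan"), float("inf"), None, 0.9]
    clean = sanitize_predictions(raw)
    # The output always has the same length as the input.
    assert len(clean) == len(raw)
    print(clean)
```

Because every input position is preserved, the number of outputs always equals the number of inputs, which is the requirement stated above.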
Hello @GemmaTuron Kindly merge PR on model eos3ev6 https://github.com/ersilia-os/eos3ev6/pull/1 Thanks.
sure! I'm waiting for the checks to be completed
Hello @GemmaTuron @DhanshreeA
Tasks List: Monday, July 10, 2023
sudo-conda-issue in Model eos2re5
Model eos46ev: It is working absolutely fine locally with run.sh, giving output probabilities for all inputs and handling SMILES with NaN values by replacing the NaN values with zeros. But somehow it behaves strangely when tested within Ersilia using --repo-path. It skips some of the inputs and returns the remaining SMILES and their corresponding probabilities. I collected the SMILES that are skipped and, strangely, they are not even the ones that have NaN values. I still can't figure out the problem. Need help here.
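To pin down exactly which inputs are being skipped, the input SMILES can be compared against the SMILES that appear in the output. The sketch below assumes the output rows carry the input SMILES string unchanged; the function name `find_skipped_inputs` is hypothetical, and duplicates in the input are handled by counting occurrences.

```python
from collections import Counter


def find_skipped_inputs(input_smiles, output_smiles):
    """Return (index, smiles) pairs for inputs that have no matching output row."""
    remaining = Counter(output_smiles)
    skipped = []
    for index, smiles in enumerate(input_smiles):
        if remaining[smiles] > 0:
            # Consume one matching output row for this input.
            remaining[smiles] -= 1
        else:
            skipped.append((index, smiles))
    return skipped


if __name__ == "__main__":
    inputs = ["CCO", "c1ccccc1", "CCO", "CN"]
    outputs = ["CCO", "CN"]
    print(find_skipped_inputs(inputs, outputs))
```

Comparing the skipped list from a local run.sh execution against the one from a run inside Ersilia would show whether the same inputs are lost in both cases.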
Model eos2re5: Discussed it with Miquel, and he suggested making changes in the Ersilia code base, as this model uses sudo commands in the Dockerfile, which keep failing with Ersilia. So Miquel made some changes in the Fetch -> get.py code to automatically discard sudo from commands for root users. I've pushed the changes to the Ersilia code base and opened a PR; Miquel is now testing the GitHub Actions. Hope it will work.
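The idea of dropping sudo when a command already runs as root can be illustrated with a short sketch. This is not the change made in Ersilia's get.py; the function name `strip_sudo` and the regular expression are assumptions written for this example. Only a sudo that starts a command (at the beginning of the line, or after `&&`, `||` or `;`) is removed, so a package named sudo passed as an argument is left alone.

```python
import re

# Matches "sudo" only where a new command starts.
_SUDO_AT_COMMAND_START = re.compile(r"(^|&&|\|\||;)(\s*)sudo\s+")


def strip_sudo(command, is_root):
    """Remove leading `sudo` from each chained command when running as root."""
    if not is_root:
        return command
    return _SUDO_AT_COMMAND_START.sub(r"\1\2", command)


if __name__ == "__main__":
    line = "sudo apt-get update && sudo apt-get install -y gcc"
    print(strip_sudo(line, is_root=True))
```

On Unix systems, the `is_root` flag could be derived from `os.geteuid() == 0`; it is kept as an explicit parameter here so the function is easy to test.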
Model Testing: Did quite a lot of model testing today; one model, i.e. eos59rr, is still pending. Thanks
Hello @GemmaTuron @DhanshreeA
Tasks List: Tuesday, July 11, 2023
Model eos2re5: Miquel made some changes in the Dockerfile and the Ersilia code base to make it work. GitHub Actions are running; hopefully it will upload to DockerHub this time, as it works according to the changes made.
Model eos46ev: I tested it with the commits from even before I refactored the model, and it skips some of the SMILES at prediction time. This means the problem didn't arise after refactoring. Miquel dug into it and found that the SMILES are not being read properly, and he made some changes in main.py for reading SMILES. Although the model passes all GitHub Actions workflows, I just tested it by fetching the latest code from Ersilia and it still skips a lot of SMILES at prediction time.
Model Testing: Did model testing today for models eos59rr and 24ci. Thanks
Thanks for the update @ZakiaYahya
The new CYP models are all ready, just awaiting the final test. We still need to think more about the issue with eos46ev. I've assigned you two new models in the meantime.
Alright @GemmaTuron Yeah, I paused work on model eos46ev for a while, waiting for you to test it. For today, I'm starting on the new models. Thanks.
Hello @GemmaTuron @DhanshreeA
Tasks List: Wednesday, July 12, 2023
Model eos4tcc: Quite an easy-peasy model that didn't require much refactoring. Tested it before and after refactoring with both run.sh and --repo-path, and it is working fine. I've opened a PR on it as well.
Model eos1579: Started by testing the model before doing any refactoring. It is working fine locally with run_predict.sh, but it fails with --repo-path while fetching. I've updated the issue, kindly have a look: https://github.com/ersilia-os/eos1579/issues/1#issuecomment-1632852573
Model eos46ev: Done quite a lot of testing since yesterday with the commit from before the model was refactored, and it shows that weird behaviour too when tested with --repo-path. Waiting for @GemmaTuron to test it with a larger number of SMILES.
Thanks.
Hello @GemmaTuron @DhanshreeA
Tasks List: Thursday, July 13, 2023
Model eos46ev: Re-testing it on Colab and on DockerHub, checking whether it reproduces the same behaviour that it shows on the CLI.
Model eos1579: Going through its log files and temporary Ersilia files to find the cause of the error. Once that error is resolved and the model works, I will start refactoring it.
Model eos2fy6: Tested it on Colab, the CLI and DockerHub.
Thanks.
Hi @ZakiaYahya !
I'll try to help with eos1579 this afternoon, and we'll continue on eos46ev as well. Don't worry, we'll figure it out. I see you also have eos4tcc assigned; in case you are too stuck with the above, try that one while we try to help you!
Hello @GemmaTuron Okay sure, I'm still working on eos1579 along with @DhanshreeA to figure it out. Debugging eos46ev as well. I opened a PR on eos4tcc 2 days ago, kindly check it here: https://github.com/ersilia-os/eos4tcc/pulls. Thanks.
Hello @GemmaTuron @DhanshreeA
Tasks List: Friday, July 14, 2023
Model eos4tcc: I've refactored the model and it is working fine both with run.sh and with --repo-path. I've opened a PR on it, but I think you somehow missed it @GemmaTuron, kindly check it.
Model eos46ev: We have encountered two problems in this model, i.e. (1) skipped SMILES in the output, meaning the number of output entries is not equal to the number of input entries, and (2) null inputs at prediction time. I incorporated Miquel's suggestion on changing the way SMILES are read from the input file and added a code snippet that replaces NaN values with zero so the output won't contain nulls. The code is working fine with the changes. I've opened a PR on it, kindly check it.
Model eos1579: I haven't been able to resolve it yet. I discussed and debugged it with @DhanshreeA, and it seems there is nothing wrong with the input/output processing; something is wrong with service.py and the request returned from BentoML. I'm digging into the details. Suggestions are welcome.
Meanwhile, @GemmaTuron, you can assign me a new model to work on along with eos1579. Thanks.
Summary
Hello,
This is a public issue for a virtual daily stand-up. We will use this to briefly share the tasks of the day and the challenges and advances made, so that we can ensure smooth support from the Ersilia mentors and alignment between daily tasks and overall internship goals.
Scope
Initiative 🐋
Objective(s)
Internship goals:
Team
Timeline
Before starting your work, line up a few tasks with a short description. This should not take long. For example, it could be something like: Wednesday 21st June
Documentation
No response