ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

[Internship Project]: Zakia Yahya #713

Closed GemmaTuron closed 1 year ago

GemmaTuron commented 1 year ago

Summary

Hello,

This is a public issue for a virtual daily stand-up. We will use this to briefly share the tasks of the day and the challenges and advances made, so that we can ensure smooth support from the Ersilia mentors and alignment between daily tasks and overall internship goals.

Scope

Initiative 🐋

Objective(s)

Internship goals:

Team

Role & Responsibility Username(s)
Intern @ZakiaYahya
Mentor @DhanshreeA
Coordinator @GemmaTuron

Timeline

Before starting your work, line up a few tasks with a short description of each. This should not take long. For example, it could be something like: Wednesday 21st June

Documentation

No response

ZakiaYahya commented 1 year ago

Hello @GemmaTuron I'm working on resolving dependency clashes with protobuf in model eos1579. Kindly assign me a new model as well so that I can work on it side by side. Thanks

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Monday, July 17, 2023

Model eos1579: Today I tried different versions of signaturizer (1.1.11 and 1.1.13). tensorflow-hub gets installed automatically at version 0.14.0, which requires protobuf >=3.19.6. That is not compatible with BentoML, which requires protobuf >=3.8,<3.19, so this is where the clash occurs. Ersilia automatically resolves clashes between BentoML and protobuf, so I searched for a tensorflow-hub version whose protobuf requirement is also compatible with BentoML. The installed tensorflow-hub 0.14.0 is the latest one, so I tried manually downgrading it in the Dockerfile to version 0.12.0, which requires protobuf 3.17.3 and therefore also satisfies the BentoML requirement. Somehow it is not working either, so I'm still working on it.
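For illustration, the clash described above can be sketched as a small range check. This is a minimal sketch, not what pip or Ersilia actually run: the version ranges are copied from the comment (not re-checked against PyPI), the helper names `parse`, `satisfies` and `compatible` are hypothetical, and `CANDIDATES` is just a small sample of protobuf versions.

```python
def parse(version):
    """Turn '3.19.6' into (3, 19, 6); pad short versions such as '3.8' with zeros."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


OPS = {
    ">=": lambda a, b: a >= b,
    "<": lambda a, b: a < b,
    "==": lambda a, b: a == b,
}


def satisfies(version, constraints):
    """Check one version against a list of (operator, bound) constraints."""
    v = parse(version)
    return all(OPS[op](v, parse(bound)) for op, bound in constraints)


# Requirements as quoted in the comment above (assumption: quoted correctly).
BENTOML = [(">=", "3.8"), ("<", "3.19")]
TF_HUB_0_14 = [(">=", "3.19.6")]
TF_HUB_0_12 = [("==", "3.17.3")]


def compatible(candidates, *requirement_sets):
    """Keep only the versions accepted by every requirement set."""
    return [v for v in candidates if all(satisfies(v, r) for r in requirement_sets)]


# Illustrative sample of protobuf versions, not an exhaustive list.
CANDIDATES = ["3.8.0", "3.17.3", "3.19.6", "3.20.3"]

print(compatible(CANDIDATES, BENTOML, TF_HUB_0_14))  # empty: the two ranges do not overlap
print(compatible(CANDIDATES, BENTOML, TF_HUB_0_12))  # only the pinned 3.17.3 remains
```

The first call comes back empty because one range starts at 3.19.6 and the other ends before 3.19, which is the clash reported above.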

Thanks

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Tuesday, July 18, 2023

Model eos1579: I did quite a lot of debugging on the protobuf/BentoML/tensorflow-hub dependency issue, but the problem that causes the model to fail in the Ersilia CLI is actually in service.py: it slices the wrong input list, which is empty, and that is why it gives "Cannot choose from an empty sequence". The model is working now and I've opened a PR for it as well. I've explained all the changes and how I resolved the errors in detail here

I will start working tomorrow on the clean-up and Dockerization of eos481p, debugging model eos7ack and testing model eos5axz. Thanks.
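As a hedged illustration of this kind of failure: "Cannot choose from an empty sequence" is the message Python's `random.choice` raises for an empty sequence, so the sketch below assumes the empty slice ended up in a call of that kind. The actual service.py code is not shown in this thread; `pick_one`, `rows` and `wrong_slice` are hypothetical names.

```python
import random


def pick_one(items):
    """Return a random item, or None when there is nothing to choose from."""
    if not items:
        return None
    return random.choice(items)


rows = ["CCO", "CCN"]
wrong_slice = rows[5:]  # slicing past the end does not fail, it silently yields []

try:
    random.choice(wrong_slice)  # unguarded call on the empty slice
except IndexError as err:
    print(f"IndexError: {err}")

print(pick_one(wrong_slice))  # guarded call returns None instead of raising
```

Guarding the call only hides the symptom. The fix reported above was to slice the correct input list in the first place.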

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Wednesday, July 19, 2023

Tested the above models and posted updates on the relevant issues. Working on refactoring model eos481p.

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Thursday, July 20, 2023

Model eos481p: Finished refactoring model eos481p and opened a PR for it as well.

Model eos5axz: Re-tested the model, as yesterday it was giving me errors because AirTable was temporarily down. It is working fine on CLI, Colab and DockerHub.

Today's Meeting: Prepared slides for today's presentation and attended the weekly lab meeting.

Kindly assign me new models to work on. Thanks.

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Friday, July 21, 2023

Model eos8a5g: Refactored the model and tested it with a few SMILES and also with eml_canonical to ensure that the model is working. Tested the model with both run.sh and --repo-path, before and after refactoring. The model is working fine and I've opened a PR for it.

Model eos9taz: Tested the model before refactoring but didn't understand the outputs it returns. Currently working on understanding the model layout.

Model eos7ack: Cloned the repo and tested it with run.sh and --repo-path, with both a few SMILES and eml_canonical. It seems that the model works fine with a few inputs but not with larger inputs. The problem doesn't lie in the model code: SwissADME only processes a few inputs at a time, and with larger inputs the server returns an empty response, hence the "empty sequence" error. I'll dig more into this model to confirm this.

Thanks.

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Monday, July 24, 2023

Model eos2re5: Testing it again on CLI, DockerHub and Colab to check the issue using ersilia fetch api.

Model eos9taz: At first I was stuck on understanding the model conceptually. I'm now refactoring it. main.py requires quite a few changes because the original code didn't contain any bash file to run main.py. main.py uses an argument parser to take Model_path, input_file and Output_file. That part needs to be removed so that the arguments can be passed through run.sh. I'm working on it.
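A minimal sketch of what a main.py without the argument parser could look like, assuming run.sh calls it with the input and output file paths as two positional arguments. That calling convention, the `run_model` placeholder and the `value` output column are assumptions for illustration, not the actual eos9taz code.

```python
import csv
import sys


def run_model(smiles_list):
    """Hypothetical stand-in for the real model: returns the length of each SMILES."""
    return [len(smi) for smi in smiles_list]


def main(input_file, output_file):
    # Read one SMILES per row, skipping the header line.
    with open(input_file, newline="") as f:
        reader = csv.reader(f)
        next(reader)
        smiles = [row[0] for row in reader if row]

    outputs = run_model(smiles)

    # Write one output row per input row.
    with open(output_file, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["value"])
        for out in outputs:
            writer.writerow([out])


# Only run when called as: python main.py <input_file> <output_file>
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

With positional arguments, the bash wrapper can simply forward the paths it receives and no named flags need to be kept in sync between the two files.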

Thanks.

GemmaTuron commented 1 year ago

Hi @ZakiaYahya

Thanks, I've added a new model in case you finish the current one

ZakiaYahya commented 1 year ago

Right @GemmaTuron, I'm almost done with model eos9taz, just making the final commits, then I will start working on the new model. Thanks

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Tuesday, July 25, 2023

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Wednesday, July 26, 2023

Model eos4q1a: I tested the model before refactoring it and it works fine with run.sh, giving up to 100 generated molecules as separate columns in the output file. When I tested it with Ersilia it also gives the correct output, up to 100 generated molecules, but as a list in a single column rather than in individual columns. So I'm trying to fix this in service.py before refactoring. Here are the details: https://github.com/ersilia-os/eos4q1a/issues/12#issuecomment-16523762

Model eos2re5: Figured out the first-null-entry issue in the output. main.py stores an extra smiles column in the dataframe, which we don't need when fetching from Ersilia, because Ersilia automatically adds the key and input columns to the output, and the input column is the SMILES. So I simply stopped storing the smiles in the dataframe in main.py and it is working fine with run.sh. I just need to confirm this by testing it with --repo-path, so I'm currently fetching the model, which takes time for eos2re5 because of the dependency installation. Once it is done I'll open the PR for it again.
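The difference between the two output shapes can be sketched as follows. This is only an illustration of spreading a list over fixed columns. The function name `expand_to_columns` and the column names (`smiles_00`, `smiles_01`, ...) are hypothetical, and the real fix belongs in service.py, which is not shown here.

```python
def expand_to_columns(generated, n_columns=100, prefix="smiles_"):
    """Spread a list of generated molecules over a fixed number of named columns.

    Lists shorter than n_columns are padded with None so that every row
    has the same set of columns.
    """
    kept = list(generated[:n_columns])
    padded = kept + [None] * (n_columns - len(kept))
    return {f"{prefix}{i:02d}": smi for i, smi in enumerate(padded)}


# One input molecule that produced two generated molecules, with three columns.
row = expand_to_columns(["CCO", "CCN"], n_columns=3)
print(row)
```

Padding with None keeps the header identical for every input, even when fewer than the maximum number of molecules is generated.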

Thanks.

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Thursday, July 27, 2023

Model eos9be7: I have started working on the model. It is not working even before refactoring because of the input format it requires. I have posted some questions on the issue, kindly check them at https://github.com/ersilia-os/eos9be7/issues/1#issuecomment-1654078052

Thanks

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Friday, July 28, 2023

Model eos9be7: I haven't done much today because the internet was unavailable. I'm trying to parse the data from the JSON input file and get the fcd scores. The JSON file loads successfully, but fcd takes the input atom by atom rather than SMILES by SMILES, so I'm stuck here. Going through the original code to figure it out.

Latest Update: I was able to figure out the correct format of the JSON file to pass to main.py. The JSON file should contain a list of pairs of SMILES for the calculation of fcd scores. I've posted the details on the issue here

Thanks

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Monday, July 31, 2023

Model eos9be7: The model is working with run.sh for a few SMILES as well as for eml_canonical. I incorporated the following things in main.py:

  1. Take a CSV file of SMILES as input instead of JSON.
  2. Convert the SMILES from the CSV into pairs of SMILES so that the model can handle the input format. To avoid problems with an odd number of SMILES in the CSV, consecutive SMILES are paired, so that if there are N SMILES in the CSV file, the number of pairs created in the JSON file is N-1, regardless of whether N is even or odd:
    [
    ["SMILES1", "SMILES2"],
    ["SMILES2", "SMILES3"],
    ...
    ]
  3. Store the pairs of SMILES in a JSON file and save it in the working directory at the backend, so that new users can better understand how the pairs are created.
  4. Read that JSON file and handle exceptions, such as the data not being in the correct format or the file not being present in the directory.
  5. Delete the JSON file if needed.
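
The pairing and JSON round trip listed above can be sketched roughly as below. This is a minimal sketch under assumptions, not the main.py from the PR: the function names `make_pairs`, `save_pairs` and `load_pairs` are hypothetical, and the validation shown is only one possible way to handle a malformed file.

```python
import json


def make_pairs(smiles):
    """Pair each SMILES with the next one: N inputs give N-1 pairs."""
    return [[a, b] for a, b in zip(smiles, smiles[1:])]


def save_pairs(pairs, path):
    """Write the pairs to a JSON file so they can be inspected later."""
    with open(path, "w") as f:
        json.dump(pairs, f, indent=2)


def load_pairs(path):
    """Read the pairs back, raising ValueError if the content is malformed.

    A missing file surfaces as the usual FileNotFoundError from open().
    """
    with open(path) as f:
        try:
            data = json.load(f)
        except json.JSONDecodeError as err:
            raise ValueError(f"{path} is not valid JSON") from err
    is_pair = lambda p: isinstance(p, list) and len(p) == 2
    if not isinstance(data, list) or not all(is_pair(p) for p in data):
        raise ValueError(f"{path} must contain a list of [smiles, smiles] pairs")
    return data


pairs = make_pairs(["SMILES1", "SMILES2", "SMILES3"])
print(pairs)
```

Because consecutive entries overlap, the pair count is always N-1, so an odd or even number of inputs makes no difference. A single input produces no pairs at all.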

I'm now trying to make this main.py run with --repo-path. Thanks.

GemmaTuron commented 1 year ago

Hi @ZakiaYahya Great job, I have merged all your PRs, and assigned some models to test

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Tuesday, August 1, 2023

Today most of my time was spent in meetings, debugging the input format of eos9be7, and understanding Bentolite.

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Wednesday, August 2, 2023

Model Testing: Tested all assigned models, i.e. eos1bba, eos157v and eos9c7k, on CLI, Colab and DockerHub.

Model eos9be7: Made it work for multiple inputs in JSON format with run.sh. Replaced NaN/inf fcd_scores with None in the output. I have a few questions that I posted on the GitHub issue here

Bentolite: Wasn't able to work on it, as I was stuck on a Docker issue and a Colab pyairtable issue, and then on model testing. After that I worked on model eos9be7.
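The NaN/inf replacement mentioned above for eos9be7 could look roughly like this. It is a sketch using only the standard library. `clean_score` is a hypothetical helper name and the real main.py may do this differently.

```python
import json
import math


def clean_score(score):
    """Return the score as a float, or None if it is missing, NaN or infinite."""
    try:
        value = float(score)
    except (TypeError, ValueError):
        return None
    return value if math.isfinite(value) else None


raw_scores = [1.5, float("nan"), float("inf"), None]
fcd_scores = [clean_score(s) for s in raw_scores]
print(json.dumps(fcd_scores))  # None is serialised as JSON null
```

Converting to None before serialising matters because NaN and Infinity are not valid JSON values, while null is.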

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Thursday, August 3, 2023

Working on eos9be7: the model is working with both fcd_canonical and fcd_canonical_smiles with run.sh. Worked on Bentolite today. Working on model eos9taz to format the output structure.

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Friday, August 4, 2023

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Monday, August 7, 2023

Model eos9be7: I refactored the model today. The model is working fine with both run.sh and --repo_path, but with a JSON file as input. I'm waiting for final comments on whether to change it to work with a CSV file as input, because the CSV route does not parse SMILES that contain a dot (.), while the JSON route does. Once that is decided I will proceed further.

Model eos9taz: I debugged the output format. The problem is in service.py, which stores the generated SMILES in a list in the output file rather than simply giving the SMILES string in a separate column. I made the necessary changes so that it returns a plain SMILES string, without the list, in the final output.

Bentolite: Working on Bentolite. Had a meeting with Miquel today, and we started testing the changes we have made in Bentolite with model eos7uix, which is a test model, to check whether the refactored Bentolite works with Ersilia. It is not working with Ersilia right now, so we are debugging and doing further refactoring.

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Tuesday, August 8, 2023

GemmaTuron commented 1 year ago

Hi @ZakiaYahya

I think there are a few model tests assigned to you in addition to the Bentolite task. Let me know how you are progressing there! Also, in addition to your tasks, please try out the test module developed by Riley and Febie and provide feedback in the Slack thread, specifying which model you tested and whether you have any further suggestions for improvement. Thanks!

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Wednesday, August 9, 2023

Model eos9be7: I've modified main.py to read a CSV file as input. It is working fine with run.sh, but with --repo-path I get errors while parsing smile_1 and smile_2 in the Ersilia CLI during fetching. First I need to get past this step, then I'll try to discard the SMILES that contain a dot.

Bentolite: Working on Bentolite to test model eos7uix, but it is giving errors because some config files, i.e. the .cfg files, are not being copied from Bentolite to the installed version of Bentolite inside the model environment created while fetching the model in the Ersilia CLI.

Model Testing: Tested model eos43at on CLI, Colab and DockerHub. It is working and giving consistent output as well.

Test module: I've tested the new "test module" using models eos3b5e and eos8a5g and it is working perfectly.

Thanks

ZakiaYahya commented 1 year ago

Hello @GemmaTuron Yes, I have one model assigned for model testing, i.e. eos43at, which I'll do today. I've just updated my task list for today. Apart from that I'll test Riley's test module as well. Thanks.

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Thursday, August 10, 2023

Model eos9be7: I've made the code work with run.sh by changing main.py so that it takes multiple inputs from a CSV file rather than JSON. But since service.py works with JSON, the model fails at fetch time with --repo-path. I made changes to it, but somehow it is still not processing the smile_1 and smile_2 format in the Ersilia CLI. I need help here because I'm stuck, but I'm still trying to make it work.

Bentolite: Wasn't able to resolve the configuration files issue, as they are not being copied during installation. I need suggestions from Miquel to proceed.

Thanks

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Friday, August 11, 2023

Model eos9be7: Trying to figure out why Ersilia is not processing multiple_inputs. From my reading of serve.log, something is wrong with service.py or its inference_api.py, which is not parsing SMILES in the smile_1 and smile_2 format. Looking further into it. Suggestions required.

Bentolite: Just had a meeting with Miquel and discussed the .cfg problem with him. He is now investigating the problem further, and then I'll proceed.

Thanks

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Monday, August 14, 2023

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Tuesday, August 15, 2023

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @DhanshreeA

Tasks List: Wednesday, August 16, 2023

GemmaTuron commented 1 year ago

Hi @ZakiaYahya! It was great to work with you, thanks so much for your contributions to Ersilia, we hope you learnt and enjoyed as much as we did! Please remain engaged with the community and feel free to open any issues or contribute to open ones :) I'll now close this issue as completed!