ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

🐛 Bug: Invalid SMILES producing a Null output #941

Open Richiio opened 5 months ago

Richiio commented 5 months ago

Is your feature request related to a problem? Please describe.

Yes, it is related to a problem. I recently incorporated a model that worked perfectly well for a valid SMILES string but failed for an invalid one. Here is a link to the model: https://github.com/ersilia-os/eos1mxi

Describe the problem:

The current model implementation returns a null output when given an invalid SMILES string.

Describe the solution you'd like.

I am looking at a possible solution where the Ersilia Model Hub checks the user's input and ensures it is a valid SMILES string before attempting to run predictions with the model. This could be part of the checks run in the GitHub Actions workflow, ensuring that only valid SMILES are processed by the model.
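A minimal sketch of what such a pre-check could look like, assuming nothing about Ersilia's internals. The function names `looks_like_smiles` and `split_inputs` are hypothetical, and the check shown is only a crude syntactic filter (allowed characters, balanced brackets), not real chemical validation. In practice the `is_valid` argument would be replaced by a proper parser, for example a wrapper around RDKit's `Chem.MolFromSmiles`, which returns `None` for strings it cannot parse.

```python
from typing import Callable, Iterable, List, Tuple

# Characters that commonly appear in SMILES strings (an approximation, not the full grammar).
_ALLOWED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    "0123456789()[]=#$:/\\@+-.%*"
)


def looks_like_smiles(smiles: str) -> bool:
    """Crude syntactic check: hypothetical stand-in for a real SMILES parser."""
    if not smiles or any(ch not in _ALLOWED for ch in smiles):
        return False
    closers = {")": "(", "]": "["}
    stack: List[str] = []
    for ch in smiles:
        if ch in "([":
            stack.append(ch)
        elif ch in closers:
            # A closing bracket must match the most recent opening bracket.
            if not stack or stack.pop() != closers[ch]:
                return False
    return not stack  # every opening bracket must have been closed


def split_inputs(
    smiles_list: Iterable[str],
    is_valid: Callable[[str], bool] = looks_like_smiles,
) -> Tuple[List[str], List[str]]:
    """Separate inputs into (valid, invalid) before any model is run."""
    valid: List[str] = []
    invalid: List[str] = []
    for smiles in smiles_list:
        (valid if is_valid(smiles.strip()) else invalid).append(smiles)
    return valid, invalid


if __name__ == "__main__":
    ok, bad = split_inputs(["CCO", "C1=CC=CC=C1", "not a smiles", "C(C"])
    print("valid:", ok)
    print("invalid:", bad)
```

Keeping the validator pluggable means the same split could run in a CI check, inside the model code, or behind a separate endpoint without changing the calling code.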

Describe alternatives you've considered

While implementing input validation in the GitHub Actions workflow is one solution, alternative approaches include incorporating input validation directly within the model's code or providing a separate input validation endpoint that users can query before submitting SMILES for predictions (this would be for the web UI).

GemmaTuron commented 5 months ago

Hi @Richiio

I need to understand this better. Can you please provide an example of the input and output you are getting in the correct and incorrect cases? Please also explain this statement: "The current model implementation encounters errors when provided with incorrect SMILES input. This can lead to unexpected behavior and inaccurate predictions." Which errors does it encounter? Which inaccurate predictions? We should not output a prediction if the input is not correct.

In principle, Ersilia already checks the validity of the input, so I don't know exactly what else we would want to do here. @miquelduranfrigola?

Richiio commented 5 months ago

Hi @GemmaTuron, that was a wrong description on my part. Apologies for that.

miquelduranfrigola commented 5 months ago

Hi @Richiio, thanks for the issue.

Can you run, in the same input file, one good molecule and one bad molecule and attach the output here as a CSV?

Richiio commented 5 months ago

@miquelduranfrigola Input file: input.csv. Corresponding output file: output (5).csv

miquelduranfrigola commented 5 months ago

OK, thanks @Richiio ! Super useful. Now I see what you mean.

Let's discuss this with @GemmaTuron and @DhanshreeA. I am not sure what the best format would be. I agree that the current solution, in this case, is not the best. This is probably something we should discuss in an online meeting.

miquelduranfrigola commented 5 months ago

@GemmaTuron what is the current status of this?

GemmaTuron commented 5 months ago

We can add it to the agenda for discussion on Tuesday!

miquelduranfrigola commented 5 months ago

Also, can we please add a title to this issue to help all of us keep track of it? Thanks!

DhanshreeA commented 4 months ago

I am labeling this low priority: it would be nice to have model output from the Ersilia CLI in a more meaningful format in case of invalid SMILES or garbage input, but it is not a breaking issue as of now. We will take this up soon!
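To make "a more meaningful format" concrete, here is a small sketch of one possible shape for the output, not Ersilia's actual format. The function name `format_results`, the column names, and the `ok` / `invalid_input` status labels are all assumptions for illustration. The idea is that each input row carries an explicit status, so an invalid molecule is reported as such instead of appearing as a bare null.

```python
import csv
import io
from typing import Mapping, Optional, Sequence


def format_results(
    inputs: Sequence[str],
    predictions: Mapping[str, Optional[float]],
) -> str:
    """Render one CSV row per input, with an explicit status column.

    `predictions` maps each input to a value, or to None when the model
    could not produce one (hypothetical convention for this sketch).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["input", "status", "prediction"])
    for smiles in inputs:
        value = predictions.get(smiles)
        if value is None:
            # Report the failure explicitly and leave the prediction empty.
            writer.writerow([smiles, "invalid_input", ""])
        else:
            writer.writerow([smiles, "ok", value])
    return buffer.getvalue()


if __name__ == "__main__":
    print(format_results(["CCO", "xyz"], {"CCO": 0.5, "xyz": None}))
```

A status column keeps the file rectangular, so downstream tools can filter on it rather than guessing why a value is missing.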