ersilia-os / eos2lqb

GNU General Public License v3.0

New model ready for testing! #2

Closed github-actions[bot] closed 1 year ago

github-actions[bot] commented 1 year ago

This model is ready for testing. If you are assigned to this issue, please try it out both on the CLI and Google Colab and let us know if it works!

GemmaTuron commented 1 year ago

@AhmedYusuff and @whoisorioki, can you test this model please?

Thanks!

whoisorioki commented 1 year ago

I'm on it!

AhmedYusuff commented 1 year ago

Hi @GemmaTuron.

EOS2lQB MODEL TEST

I tested the model on my Ubuntu 22.04 system and on Google Colab.

Result

whoisorioki commented 1 year ago

Hey @GemmaTuron, here are the updates:

👉 Available APIs:

💁 Information:

GemmaTuron commented 1 year ago

Hi @whoisorioki

This model was already ready to use. Testing should only require the three Ersilia commands (fetch, serve and api); no code changes are needed.

whoisorioki commented 1 year ago

Oh okay, thank you for that information, @GemmaTuron.

GemmaTuron commented 1 year ago

@HellenNamulinda

Looking at the output provided by @AhmedYusuff, I see that we only return high or low for each cut-off, when we would actually like to get the probability of high for each cut-off. The model must already compute this probability in order to classify molecules as high or low. Do you think you can identify the piece of code doing that conversion and work on it to provide the probabilities as numbers instead?

HellenNamulinda commented 1 year ago

Hello @GemmaTuron, let me update the code to return the probabilities, i.e. P(high) and P(low).

GemmaTuron commented 1 year ago

Great, many thanks @HellenNamulinda!

I think the probability of high is enough, so there will be only one value for the 20% cutoff and one for the 50% cutoff. What do you think?
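
A minimal sketch of the change being discussed, not the model's actual code: it assumes the model already has a P(high) value per cutoff and that the existing conversion is a simple threshold. The function names, the 0.5 threshold and the example probabilities are hypothetical and only illustrate the idea.

```python
from typing import Dict, List

# The two cutoffs mentioned in this thread; the string keys are an assumption.
CUTOFFS = ("20%", "50%")


def to_label(p_high: float, threshold: float = 0.5) -> str:
    """The kind of conversion to remove: it collapses a probability into a class."""
    # The 0.5 threshold is assumed for illustration only.
    return "high" if p_high >= threshold else "low"


def to_outcome(p_high_by_cutoff: Dict[str, float], ndigits: int = 2) -> List[float]:
    """Return P(high) for each cutoff as a number, in a fixed order."""
    return [round(p_high_by_cutoff[cutoff], ndigits) for cutoff in CUTOFFS]


# Illustrative probabilities, not real model output.
probs = {"20%": 0.22, "50%": 0.37}

labels = [to_label(probs[cutoff]) for cutoff in CUTOFFS]  # previous behaviour
outcome = to_outcome(probs)                               # requested behaviour
```

With these made-up inputs, the label version gives the same answer ("low") for both cutoffs, while the numeric version keeps the difference between them visible.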

HellenNamulinda commented 1 year ago

Oh yeah, that's right. Also, if inference is run on an entire file and the output is a CSV file, it is easier to understand because the column names are shown. But if someone uses the CLI for one molecule, I think it can be quite hard to understand if the printed outcome is just two values, say:

"output": {
    "outcome": [
        0.22,
        0.37
    ]
}

where 0.22 is P(high) for the 20% cutoff and 0.37 is P(high) for the 50% cutoff. This is because column names are not printed, which is why I had prepended HOB(20%): and HOB(50%): to the saved values.

Do you think printing the output like that is okay, with a detailed explanation of how to interpret it in the README.md (from metadata.json)?
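
For anyone reading the bare two-value outcome, one possible client-side workaround is to pair the values with names after the fact. This is only a sketch under assumptions: the column names below are hypothetical (the real ones would be whatever the CSV header uses), and the order is assumed to be the 20% cutoff first, then the 50% cutoff, as in the example above.

```python
import json
from typing import Dict, Sequence

# Hypothetical column names; substitute the header of the model's CSV output.
COLUMNS = ("hob_20_prob_high", "hob_50_prob_high")


def label_outcome(outcome: Sequence[float],
                  columns: Sequence[str] = COLUMNS) -> Dict[str, float]:
    """Pair each printed value with its column name so the result is self-describing."""
    if len(outcome) != len(columns):
        raise ValueError(
            f"expected {len(columns)} values, got {len(outcome)}"
        )
    return dict(zip(columns, outcome))


# Values taken from the illustrative output in this comment.
labelled = label_outcome([0.22, 0.37])
print(json.dumps({"output": labelled}, indent=4))
```

Keeping the values numeric and attaching names separately avoids mixing text such as HOB(20%): into the values themselves, so the CSV columns stay easy to parse.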

HellenNamulinda commented 1 year ago

Greetings @GemmaTuron, the changes have been made and merged. I fetched the new model and ran predictions on the eml_canonical.csv dataset. The output file is output.csv.