Mordred (eos78ao) output formatting issue

JHlozek commented 3 months ago

Describe the bug I've identified the source of the issue I'm having in ZairaChem with the Mordred (eos78ao) model. The descriptors from the Mordred model are saved as a list '[]' whereas all the other descriptor models don't save the feature lists with the square brackets.

This causes issues after reading the raw.h5 descriptor file during further processing in zairachem.utils.matrices.py in save_summary_as_csv() where 'values[0]' in line 128 is not the expected 1-dimensional array and cannot be converted to floats.

I confirmed this by editing the code to reshape the array from (342, 1, 1613) to (342, 1613) which stopped this specific issue (but breaks the pipeline in other areas). So I suggest altering the output of eos78ao to be consistent with the other descriptor outputs (i.e. without explicitly saving the brackets)?

To Reproduce Steps to reproduce the behavior in Ersilia:

ersilia serve eos78ao
ersilia run -i test_smiles.csv -o mordred_out.csv
See descriptors saved with '[]'

Steps to reproduce the behavior in ZairaChem:

fresh install of ZairaChem and eos78ao
zairachem fit -i test_smiles.csv -m test_model.csv
See test_model/descriptors/mordred only has 'raw.h5' and no 'raw_summary.csv'
Navigate to test_model/descriptors/mordred and open a Python runtime
import zairachem
from zairachem.utils.matrices import Hdf5
h = Hdf5("raw.h5")
h.save_summary_as_csv()

Expected behavior ZairaChem completes post-processing of Mordred descriptors (in zairachem.utils.matrices.py in save_summary_as_csv()). I think this requires eos78ao to save its output without the square brackets in the feature list.

Desktop (please complete the following information): Both in Rocky Linux 9 and Ubuntu 20.04

JHlozek commented 3 months ago

I've been digging some more into this and discovered that the reason the descriptors are not read correctly is due to the eos78ao model's api_schema.json not getting configured correctly during the model fetch. Then the data is not stored as expected. So I think this is more an Ersilia model issue rather than ZairaChem.

The data type and shape are not specified and gets configured as: "output": { "descriptor": { "meta": null "type": "null", } }

If I manually edit the file to the following, then the model output is saved correctly and ZairaChem reads the data and proceeds as expected: "output": { "descriptor": { "type": "mixed_array", "shape": [ 1613 ], "meta": null } }

I've attached the fetch log from the hub. I see that there is a file that couldn't be fetched that may be related to production of the api_schema file?: "Error response from daemon: Could not find the file /root/eos/dest/eos78ao/model/framework/example.csv"

eos78ao_fetch.log

GemmaTuron commented 3 months ago

I can get the model to work fine when fetching from DockerHub (but not tested the model inside ZairaChem - maybe that pops up the error due to h5 conversion) I do get fetched a weird image artifact that might be due to the new image of ersilia? see the screenshot:

GemmaTuron commented 3 months ago

indeed with h5 it does fail

ersilia run -i my_input.csv -o my_output.h5 eos78ao_h5.txt

DhanshreeA commented 3 months ago

Hi @GemmaTuron this, to the best of my knowledge, seems to be a result of two issues - one from the ersilia CLI, and the other related to Docker itself. Ersilia thinks it is having trouble working with conventional pull which is truly not the case so it ends up pulling the same image in both ARM and AMD versions. And currently Docker is not mature enough with multi platform builds wherein even if both the architectures exist on your system, docker should list it as a single image. The default behavior within Docker is to reuse the label when two images are labeled the same way, thereby making one of them dangling (or none as you see here).

My best guess is that it is some combination of those things, but knowing your Docker version might be helpful as well!

GemmaTuron commented 3 months ago

ah that part makes sense, is there anything we can do to avoid it happening? I had docker 4.23 but updating to 4.29 right now

DhanshreeA commented 3 months ago

From our end, I need to work on the conventional pull issue so we don't end up pulling two flavors of the same image ever.

JHlozek commented 2 months ago

Just in case this got lost on Slack, the recent changes did fix some of the metadata processing issue but the api_schema.json is still not completely correct. The 'type' field is still null: api_schema.json

These are the molecules that were used during the fetch: example_standard_input.csv example_standard_output.csv

GemmaTuron commented 1 month ago

@DhanshreeA

Could we look into this and close the issue?

GemmaTuron commented 1 month ago

@DhanshreeA @miquelduranfrigola I thought it was because this model is using an old service.py file but I tried updating it and no changes, the api_json still looks old. When is the api schema created? Am I missing something here?

miquelduranfrigola commented 1 month ago

So the API schema is created at fetch time... humm...

Could you try adding a example.csv and an output.csv file in the framework folder? @DhanshreeA we could actually check whether this model works well with the ersilia-pack way, which simplifies a bit the api schema resolution thing...

But before that, @DhanshreeA , could you check that we you are able to fetch this model in its current form? I guess it is still not clear to me whether fetch works but schema data is not resolved completely (minor issue) or whether fetch doesn't work at all due to this.

I know my answer is confusing, so let's discuss this anytime you want.

GemmaTuron commented 1 month ago

The model works perfectly fine Miquel I have checked it last week both via github and docker

miquelduranfrigola commented 1 month ago

@GemmaTuron is this issue still open, then?

DhanshreeA commented 1 month ago

@miquelduranfrigola I think what @GemmaTuron means is that the model is fetching - and serving+ predicting - perfectly fine (which I can confirm that it does). However as you mention here in the second part of your comment, the schema is not fully resolved.

miquelduranfrigola commented 3 weeks ago

Thanks @DhanshreeA - based on our last week's work on schema resolution, is this solved now? Feel free to close the issue if so.

DhanshreeA commented 3 weeks ago

Hey @JHlozek, we've pushed a fix for this which should resolve the issue with eos78ao's API schema resolution. Could you experiment with it again within the context of ZairaChem and let us know if it works?

JHlozek commented 3 weeks ago

Hi @DhanshreeA

I pulled and installed the latest ersilia code but I still get the same api_schema.json issue when I re-fetch eos78ao: api_schema.json

In your environment, does your schema detect the data type as a "mixed_array" correctly?

miquelduranfrigola commented 3 weeks ago

This is surprising. Note that we did not update the model - we updated Ersilia. Did you pull the latest Ersilia version?

JHlozek commented 2 weeks ago

I just tried it again, this time creating a separate environment for Ersilia, pulling and installing the latest model hub version and re-fetch eos78ao and it still produces the same api_schema file as above.

miquelduranfrigola commented 2 weeks ago

Thanks @JHlozek - please give us some time to investigate it. Will come back to you soon.

DhanshreeA commented 2 weeks ago

Hi all, I think it was an issue with base ersilia docker image not having been updated, which I've updated a few times since the latest bug fixes. I'll confirm if the model docker build works as expected now and update here.

DhanshreeA commented 2 weeks ago

Hi @JHlozek and @miquelduranfrigola post updating all the docker images involved in the process, I can confirm that this works successfully with the desired api schema. You can give it a go @JHlozek :)

JHlozek commented 1 week ago

I've confirmed that I've pulled the model again and the api_schema is setup correctly. Nice work! :)

GemmaTuron commented 1 week ago

wonderful, let's close this issue then!

ersilia-os / zaira-chem

Mordred (eos78ao) output formatting issue #42