qupath / qupath-extension-wsinfer

QuPath extension to work with WSInfer - https://wsinfer.readthedocs.io/

Support local models #32

Closed (petebankhead closed this 1 year ago)

petebankhead commented 1 year ago

I suspect people will want to run models that they have locally, without needing them to be in the zoo.

@kaczmarj is there any pattern for doing this with WSInfer generally that we should follow?

kaczmarj commented 1 year ago

there is a pattern to do this with WSInfer.

one would need two files: a torchscript file of the model and a JSON config file. the torchscript file can be made with torch.jit.script(model) followed by torch.jit.save. the JSON config uses the same schema as the config files in the wsinfer model zoo. for example, here is the config for the breast tumor model we have in the zoo: https://huggingface.co/kaczmarj/breast-tumor-resnet34.tcga-brca/blob/main/config.json
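
for illustration, a minimal export sketch (model stands in for your trained torch.nn.Module; this is just the two calls mentioned above, not an official recipe):

import torch

# model = ...  # your trained torch.nn.Module
model.eval()  # switch to inference mode before scripting
model_jit = torch.jit.script(model)
torch.jit.save(model_jit, "torchscript_model.pt")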

the config must adhere to the schema defined at https://github.com/SBU-BMI/wsinfer-zoo/blob/main/wsinfer_zoo/schemas/model-config.schema.json

on the command line, one can check that a JSON config is valid with wsinfer-zoo validate-config config.json.
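
to make the fields concrete, here is a rough sketch of generating such a config from python. the field names and values below are illustrative, based on the zoo configs linked above; treat the schema and wsinfer-zoo validate-config as the source of truth.

import json

# illustrative values only -- validate the result with:
#   wsinfer-zoo validate-config config.json
config = {
    "spec_version": "1.0",
    "architecture": "resnet34",
    "num_classes": 2,
    "class_names": ["notumor", "tumor"],
    "patch_size_pixels": 350,
    "spacing_um_px": 0.25,
    "transform": {
        "resize_size": 224,
        # hypothetical normalization stats; use your training values
        "mean": [0.485, 0.456, 0.406],
        "std": [0.229, 0.224, 0.225],
    },
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)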

i have included some documentation on this process at https://wsinfer.readthedocs.io/en/latest/user_guide.html#use-your-own-model but it is not yet complete.

xalexalex commented 1 year ago

Just as a quick note: it already works with local models, provided one is careful in editing the config.json manually. The only problem is that it currently only works by substituting the files inside one of the predefined models. So I guess that if you changed the dropdown in the WSInfer extension window and made it scan a local folder (in addition to the predefined hardcoded models), it would already work.
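
For anyone who wants to try this, a rough sketch of the file swap. The zoo_model_dir path below is hypothetical; locate the folder the extension actually created for the model you're overwriting.

import shutil
from pathlib import Path

# hypothetical location -- find the real folder created by the extension
zoo_model_dir = Path.home() / "wsinfer-zoo-models" / "breast-tumor-resnet34.tcga-brca"

shutil.copy("my_model.pt", zoo_model_dir / "torchscript_model.pt")
shutil.copy("my_config.json", zoo_model_dir / "config.json")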

p.s. congrats and thanks for this super-useful integration!

(edited to add link to video demonstrating custom model usage)

petebankhead commented 1 year ago

Thanks @xalexalex I'm glad that works!

@alanocallaghan do you have any thoughts on how we can add better support for local models?

alanocallaghan commented 1 year ago

Not really. The suggestion of adding a user directory that populates the dropdown alongside the Zoo models sounds sensible, probably with the convention of one subfolder per model (each holding the TorchScript file and its config.json), for simplicity's sake?

vipulnj commented 1 year ago

I replaced a downloaded model with a local .pt model (and corresponding config.json) as suggested by @xalexalex, but ran into some hurdles. It appears the issue stems from a backend mismatch – the model was trained on CUDA but is being utilized on MPS. Even attempting CPU inference didn't resolve the issue.

On the other hand, the pre-packaged models integrated seamlessly. I'd like to identify whether a minor oversight or misstep on my end is causing this hiccup.

Also, whenever there's an update concerning this feature, it would be greatly appreciated if some documentation could be provided regarding cross-platform compatibility.

Thank you for your efforts on qupath and WSInfer. Your work is invaluable and much appreciated.

kaczmarj commented 1 year ago

@vipulnj -

It appears the issue stems from a backend mismatch – the model was trained on CUDA but is being utilized on MPS. Even attempting CPU inference didn't resolve the issue.

can you try putting the model on the CPU before saving the torchscript file? something like this:

import torch

model = model.cpu()                  # move parameters off the GPU first
model_jit = torch.jit.script(model)  # compile to torchscript
torch.jit.save(model_jit, "torchscript_model.pt")

Also, whenever there's an update concerning this feature, it would be greatly appreciated if some documentation could be provided regarding cross-platform compatibility.

this is a great idea, and i agree it should be done. i will need to learn more about cross-platform compatibility first and test out different configurations.

Thank you for your efforts on qupath and WSInfer. Your work is invaluable and much appreciated.

we appreciate the kind words 😄

xalexalex commented 1 year ago

@vipulnj you ran into untested territory: I don't currently own a GPU, so my models are all trained on CPU and I could never have hit this error 😅 Anyway, did you also edit the lfs file (I think it's something like lfs-storage.txt, but I don't have it on hand to check)? You should edit the sha256sum and file size to match your scripted model file.
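
If it helps, a small sketch for computing the two values the pointer file needs, assuming the standard Git LFS pointer fields (oid sha256:... and size ...):

import hashlib
import os

path = "torchscript_model.pt"

# stream the file through sha256 so large models don't have to fit in memory
digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)

print(f"oid sha256:{digest.hexdigest()}")
print(f"size {os.path.getsize(path)}")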

vipulnj commented 1 year ago

An update after following what was suggested:

Fixing the oid sha256: and size fields in lfs-pointer.txt (for my .pt file) sort of helped: it no longer prompts me to download a new model file.

I also rebuilt the .pt file using model_jit = torch.jit.script(model.cpu()).

To test this, I selected a small annotation and hit Run (on CPU). I saw the CPU utilization go up, and then QuPath stopped responding.