lbcb-sci / RiNALMo

RiboNucleic Acid (RNA) Language Model
https://sikic-lab.github.io/
Apache License 2.0

Inference #4

Closed alexandremarcil closed 2 months ago

alexandremarcil commented 2 months ago

Hi, I've had no issues installing and running your code. I am interested in using your model for inference, for example to predict the MRL for various RNA sequences. I've tried modifying your code to do so, but without much success so far.

Any help you can provide with this would be greatly appreciated. The idea is to pass in a new CSV dataset of RNA sequences and output the predictions of the fine-tuned model (MRL, for example).

Thanks!

RJPenic commented 2 months ago

Hello, we are currently focused on other aspects of the model, so a downstream-task inference script is not high on our priority list (at least right now).

You could assign "dummy" labels to the entries in your dataset, use it as a dummy test set, and save the model's outputs in the test_step method. For example, for MRL you could assign a label of 0.0 to all your sequences, save them into a CSV similar to the one we use in our code, and then save the model's outputs during testing.
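A rough sketch of the dummy-label step in plain Python, in case it helps. The column names `sequence` and `label` and both helper functions are hypothetical and not taken from the repository; they should be changed to match whatever the MRL CSV used by the code actually contains. Collecting the outputs inside test_step depends on the model code and is not shown here; the second helper only assumes you already have one prediction per sequence.

```python
import csv


def write_dummy_test_csv(sequences, out_path,
                         seq_col="sequence", label_col="label",
                         dummy_label=0.0):
    """Write sequences with a placeholder label so they can be used as a test set.

    seq_col and label_col are assumed names; match them to the real dataset CSV.
    """
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=[seq_col, label_col])
        writer.writeheader()
        for seq in sequences:
            writer.writerow({seq_col: seq, label_col: dummy_label})


def write_predictions_csv(sequences, predictions, out_path,
                          seq_col="sequence", pred_col="prediction"):
    """Pair each sequence with the output collected for it during testing."""
    if len(sequences) != len(predictions):
        raise ValueError("expected exactly one prediction per sequence")
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([seq_col, pred_col])
        for seq, pred in zip(sequences, predictions):
            writer.writerow([seq, float(pred)])


def read_rows(path):
    """Read a CSV back as a list of dicts (used here only for checking)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))
```

Usage would be along the lines of `write_dummy_test_csv(my_sequences, "dummy_test.csv")`, then running the test stage on that file, and finally `write_predictions_csv(my_sequences, collected_outputs, "predictions.csv")`. The dummy labels only exist so the data loading works; any metric reported during that test run is meaningless.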

Let me know if you have any additional questions. :)