Open taylorreiter opened 8 months ago
Hi @taylorreiter, I just found this project and it seems really cool. I've just released an update to AutoPeptideML (0.3.1) to address this issue. Now you can calculate the representations once with:
df_repr = representation_engine.compute_representations(df.sequence, average_pooling=True)
and then run the predictions, passing df_repr as an additional argument:
predictions = autopeptideml.predict(
df=df, re=representation_engine, ensemble_path=model_folder, outputdir=tmp_dirname,
df_repr=df_repr
)
This should allow you to run the code in a loop of some sort and avoid calculating the embeddings every time.
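For illustration, here is a minimal sketch of that compute-once, predict-many loop. It does not call AutoPeptideML: `compute_representations` and `predict` below are hypothetical stand-ins for the embedding step and the per-model prediction step, and the model names are made up. The only point is that the expensive step runs once, outside the loop.

```python
from typing import Dict, List

# Counts how often the (pretend) expensive embedding step runs.
embedding_calls = 0


def compute_representations(sequences: List[str]) -> Dict[str, List[float]]:
    """Hypothetical stand-in for the embedding step: one small vector per sequence."""
    global embedding_calls
    embedding_calls += 1
    return {seq: [float(len(seq)), float(seq.count("K"))] for seq in sequences}


def predict(model_name: str, sequences: List[str],
            reprs: Dict[str, List[float]]) -> List[float]:
    """Hypothetical stand-in for one model: scores each precomputed vector."""
    weight = float(len(model_name))
    return [weight * sum(reprs[seq]) for seq in sequences]


sequences = ["MKKLL", "GAVLK", "KKKK"]
model_names = ["model_a", "model_bb"]  # made-up model identifiers

# Compute the representations once, outside the loop ...
reprs = compute_representations(sequences)

# ... then reuse them for every model.
results = {name: predict(name, sequences, reprs) for name in model_names}
```

With the real library, the loop body would presumably be the `autopeptideml.predict(...)` call shown above, with a different `ensemble_path` per model and the same `df_repr` each time.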
Thank you @RaulFD-creator!
As @keithchev pointed out over in #10:
I think it could be simple to run each autopeptideml model in one script, which would then generate the ESM embeddings only once.
Similarly, keith mentioned:
This will be something to keep an eye out for.