nickbhat closed this issue 2 years ago
Hi @nickbhat, there's a function intended for this: https://github.com/Merck/BioPhi/blob/f764b6188211eb2d4e6b4f091729f5ab89c7e406/biophi/humanization/methods/sapiens/predict.py#L36
However, it looks like the return_all_hiddens arg is actually ignored :) Do you want to create a pull request by any chance? I'm always happy to get more contributors.
It would be a simple fix here: https://github.com/Merck/BioPhi/blob/f764b6188211eb2d4e6b4f091729f5ab89c7e406/biophi/humanization/methods/sapiens/roberta.py#L76
Btw, this way we can only get the embeddings. If you also wanted the attention weights, that would require some changes to the fairseq code.
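To illustrate the kind of problem described above (a flag that is accepted but never passed on), here is a minimal toy sketch. The functions encoder_forward, model_forward_buggy and model_forward_fixed are hypothetical stand-ins invented for this example; they are not the actual BioPhi or fairseq code, and the real fix may look different.

```python
# Hypothetical stand-ins for illustration only, not the BioPhi or fairseq code.

def encoder_forward(tokens, return_all_hiddens=False):
    """Toy encoder: each of 3 'layers' adds 1 to every token value."""
    inner_states = []
    x = list(tokens)
    for _ in range(3):
        x = [v + 1 for v in x]
        if return_all_hiddens:
            # Keep the output of every layer, not just the last one.
            inner_states.append(x)
    return x, {'inner_states': inner_states}


def model_forward_buggy(tokens, return_all_hiddens=False):
    # Bug pattern: the flag is accepted here but never forwarded,
    # so the encoder always runs with its default (False).
    return encoder_forward(tokens)


def model_forward_fixed(tokens, return_all_hiddens=False):
    # Fix pattern: pass the flag through to the inner call.
    return encoder_forward(tokens, return_all_hiddens=return_all_hiddens)
```

With the buggy wrapper, extra['inner_states'] stays empty even when return_all_hiddens=True is requested; with the fixed wrapper it holds one entry per layer.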
I'll take a look at this and submit a PR! Always happy to contribute :) I'll get around to it in a couple of weeks, if that's okay.
Fixed by https://github.com/Merck/BioPhi/pull/21/
Example usage:
from biophi.humanization.methods.sapiens.predict import sapiens_predict_seq
pred, extra = sapiens_predict_seq(
    seq=seq,  # seq should be the variable region sequence only
    chain_type='H',  # chain type is H or L
    return_all_hiddens=True
)
embeddings_per_layer = extra['inner_states']
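If a single fixed-length vector per sequence is wanted, one common option is to mean-pool the per-residue vectors of one layer. The sketch below uses plain nested lists as a stand-in, because the thread does not state the exact type or shape of extra['inner_states']. It assumes each layer is a sequence of per-residue vectors; real outputs may be tensors with an extra batch dimension and would need converting first. mean_pool and toy_embeddings_per_layer are hypothetical names introduced for this example.

```python
def mean_pool(layer):
    """Average per-residue vectors into one fixed-length vector.

    Assumes `layer` is a non-empty sequence of equal-length vectors
    (one per residue). This layout is an assumption, not confirmed
    for extra['inner_states'].
    """
    n = len(layer)
    dim = len(layer[0])
    return [sum(vec[i] for vec in layer) / n for i in range(dim)]


# Toy stand-in for embeddings_per_layer: 2 layers, 3 residues, 2 dimensions.
toy_embeddings_per_layer = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]],
]

# Pool the last layer into a single sequence-level embedding.
sequence_embedding = mean_pool(toy_embeddings_per_layer[-1])
```

Which layer to pool, and whether to use the mean at all, depends on the downstream task, so treat this as a starting point rather than a recommendation.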
Hello,
Is there any ability to create an API for producing embeddings from input sequences? Users could implement this themselves if #19 turns out to be true. If not, perhaps an embedding API could be exposed without having to make weights publicly available?
Thanks!