Open Christof23 opened 7 months ago
This PR addresses the above issue, but I'm not sure whether the updates to `layer_norm` are appropriate: https://github.com/huggingface/candle/pull/1888
I'm running into the same issue. I followed the instructions in the Candle reference guide on running a Hugging Face model in Candle, and I was surprised that the recommended steps (loading the `bert-base-uncased` model into the `BertModel` struct) result in an error.
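Until the loader handles the older naming, one workaround is to rewrite the checkpoint's tensor names before loading. Below is a minimal sketch (not the PR's fix) using the `safetensors` Python utilities; the file paths are hypothetical, and it assumes the only mismatches are the `bert.` prefix and the `gamma`/`beta` LayerNorm names:

```python
def rename_key(name: str) -> str:
    """Map old-style BERT tensor names to the names candle's BertModel looks up."""
    # Strip the leading "bert." prefix, if present.
    if name.startswith("bert."):
        name = name[len("bert."):]
    # Translate the older LayerNorm parameter names to weight/bias.
    if name.endswith(".gamma"):
        name = name[: -len("gamma")] + "weight"
    elif name.endswith(".beta"):
        name = name[: -len("beta")] + "bias"
    return name


def rename_checkpoint(src: str, dst: str) -> None:
    """Load a safetensors file, rename every key, and save a new file."""
    # Imported here so the helper above works without safetensors installed.
    from safetensors.numpy import load_file, save_file  # pip install safetensors

    tensors = load_file(src)
    save_file({rename_key(k): v for k, v in tensors.items()}, dst)
```

For example, `rename_checkpoint("model.safetensors", "model_renamed.safetensors")` would produce a file whose keys match what the Rust side expects, so `embeddings.word_embeddings.weight` resolves.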
Hi, I was running the BERT example code and noticed that some of the variable names don't align with the current safetensors obtained via:
For example, the model spec in `candle-transformers/src/models/bert.rs` results in: `Error: TensorNotFound("embeddings.word_embeddings.weight")`.

The safetensors version prepends all variable names with `bert` and uses the older `gamma`/`beta` notation. This issue has also been noted here. I think the problem is in `layer_norm`, which expects `weight` and `bias` rather than `gamma` and `beta`. The safetensor variables are as follows: