huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Remote code improvements #27688

Closed LysandreJik closed 9 months ago

LysandreJik commented 10 months ago

Originally posted by @Jackmin801 on the awesome jinaai/jina-embeddings-v2-base-en repository.

  1. Documentation. The transformers documentation on custom models could be clearer about the syntax of auto_map in config.json. We never used the register functions; we edited config.json directly. The behaviour that lets a model use code on the Hub from another repository via "--" does not seem to be documented anywhere; we only figured out it was possible because save_pretrained saves references in this format when using a custom model. The feature seems fairly new (roughly six months old, we believe), which may explain why it is not yet well documented. But if it were better communicated to users that this is possible, we think more people would develop on the Hub as we did.
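For illustration, an auto_map entry maps Auto* classes to the module and class implementing them; the class and repository names below are hypothetical, but the two shapes (a local module path, and the "--" form pointing at code in another Hub repository) are the ones being discussed:

```json
{
  "auto_map": {
    "AutoConfig": "configuration_bert.MyBertConfig",
    "AutoModel": "some-org/implementation-repo--modeling_bert.MyBertModel"
  }
}
```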

  2. Promote trust_remote_code to an environment variable. Some downstream libraries currently do not support passing the trust_remote_code argument; notable for our work was sentence_transformers, despite quite a few requests for it [1, 2, 3]. This forced us to monkeypatch the model-loading logic in those libraries to be able to use our model. If trust_remote_code could also be read from an environment variable, e.g. HUGGINGFACE_TRUST_REMOTE_CODE, one would only need to set that variable to enable loading custom models. This would make custom models much easier to adopt throughout the ecosystem.
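A minimal sketch of the proposed fallback, assuming the HUGGINGFACE_TRUST_REMOTE_CODE name suggested above (transformers does not implement this at the time of the issue):

```python
import os


def trust_remote_code_default() -> bool:
    """Proposed default for trust_remote_code when the caller does not
    pass it explicitly: read it from an environment variable.

    The variable name is the one suggested in this issue, not an
    existing transformers setting.
    """
    value = os.environ.get("HUGGINGFACE_TRUST_REMOTE_CODE", "")
    return value.strip().lower() in ("1", "true", "yes")
```

An explicit trust_remote_code argument would still take precedence; the environment variable only supplies the default for libraries that never expose the argument.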

  3. Silent failure when trust_remote_code is not set to True. When trust_remote_code is not set to True for our model, transformers appears to load the classic BERT implementation instead and emit a batch of warnings about re-initialised weights. This is not ideal: if a downstream evaluation script forgets to set the argument, it silently produces inaccurate results, and the only ways to notice are to scroll through the output logs for the warning or to print the model and check that it has the right class name. If it instead raised an error asking the user to set trust_remote_code, the mistake would be caught immediately, which would have saved us quite some head-scratching and communication overhead in the team.

Definitely agree with all the points above; would be awesome to work on this.

LysandreJik commented 10 months ago
  1. was resolved by @MKhalusova in https://github.com/huggingface/transformers/pull/27213, it will be in the next release :hugs:
github-actions[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.