Closed Atry closed 1 year ago
Makes sense to me. On reflection, HF transformers has high redundancy in its code and keeps the inheritance footprint rather minimal: https://github.com/huggingface/transformers/tree/main/src/transformers/models. This makes it possible to run and integrate the latest changes quickly.
The changes in this PR are already in our internal repository and have been tested there. The chance of breaking this repository is minimal but not zero, because there is no publicly visible CI. I am merging this PR. If you encounter any issues, feel free to report them.
Exactly. I think modularization is a goal on which we want to differ from HuggingFace's repositories.
Currently `BLOOMSharded` is a subclass of `CausalLM`, while it skips `CausalLM`'s constructor. This is a surprising behavior that we might want to avoid.

This PR extracts `CausalLM`'s constructor to `AutoCausalLM` to detect settings from `model_id`, so that we don't have to skip `CausalLM`'s constructor in custom models.