huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
135.56k stars 27.14k forks

Initializing via AutoImageProcessor before AutoProcessor is imported causes `AttributeError` #34307

Open alex-jw-brooks opened 1 month ago

alex-jw-brooks commented 1 month ago

System Info

Who can help?

Probably @zucchini-nlp @amyeroberts, and @qubvel

Information

Tasks

Reproduction

There seems to be an edge case in the loading behavior that can sometimes be hit if something is initialized with `from_pretrained` through `AutoImageProcessor` before `AutoProcessor` is imported, and `from_pretrained` is then called on `AutoProcessor`.

Repro case - this works as expected:

```python
model_name = "microsoft/Phi-3.5-vision-instruct"

from transformers import AutoImageProcessor
from transformers import AutoProcessor

AutoImageProcessor.from_pretrained(model_name, trust_remote_code=True)
AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
```

But this breaks:

```python
model_name = "microsoft/Phi-3.5-vision-instruct"

from transformers import AutoImageProcessor
AutoImageProcessor.from_pretrained(model_name, trust_remote_code=True)

from transformers import AutoProcessor
AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
```

with `AttributeError: module transformers has no attribute Phi3VImageProcessor`.

I've tried to reproduce this with a few other models but haven't been able to yet. It is also probably worth noting that `AutoProcessor` does not return the same class as `AutoImageProcessor` for this model, which might matter.

Expected behavior

`AutoImageProcessor.from_pretrained` and `AutoProcessor.from_pretrained` should behave the same way, if possible, regardless of import and invocation order.

LysandreJik commented 1 month ago

Seems like a bug indeed, thanks for the great report! cc @Rocketknight1, can you take a look at this?

Rocketknight1 commented 1 month ago

Confirmed the bug, and thanks for the clean reproduction script! I suspect this is an issue with how we construct the auto-class mappings; I'm working on it now.

alex-jw-brooks commented 1 month ago

Awesome, thank you both! 🙂

Rocketknight1 commented 1 month ago

After investigation, the issue seems to be here. Manually setting an attribute of the transformers library like that seems to break other things in our imports, and it interacts with the `AutoProcessor` import in ways I don't fully understand.
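For context, here is a minimal stand-in for the failure pattern described above. This is a hedged sketch, not the real transformers internals: `LazyModule` and its registry are illustrative stand-ins for transformers' lazy import machinery, and the classes are dummies.

```python
import types

# Hedged sketch: transformers exposes its public names through a lazily
# resolved module whose __getattr__ looks names up on demand. This class
# mimics that pattern; it is NOT the real transformers implementation.
class LazyModule(types.ModuleType):
    def __init__(self, name, registry):
        super().__init__(name)
        self._registry = registry

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name in self._registry:
            value = self._registry[name]
            setattr(self, name, value)  # cache on the module object
            return value
        raise AttributeError(f"module {self.__name__} has no attribute {name}")

mod = LazyModule("transformers", {"AutoProcessor": object})

# Names known to the lazy registry resolve fine:
assert mod.AutoProcessor is object

# A remote-code class registered by manually calling setattr exists only on
# the specific module object it was set on:
class Phi3VImageProcessor:  # stand-in for the remote-code class
    pass

setattr(mod, "Phi3VImageProcessor", Phi3VImageProcessor)
assert mod.Phi3VImageProcessor is Phi3VImageProcessor

# Any code path that consults a module object the attribute was never set on
# (e.g. because of import ordering) hits the error from this issue:
other = LazyModule("transformers", {"AutoProcessor": object})
try:
    other.Phi3VImageProcessor
except AttributeError as exc:
    print(exc)  # module transformers has no attribute Phi3VImageProcessor
```

The point of the sketch is only that `setattr` on one module object does not make the name resolvable everywhere, which is consistent with the import-order sensitivity seen in the reproduction.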

Rocketknight1 commented 1 month ago

Final diagnosis:

cc @zucchini-nlp - can you see an obvious fix to that method that would work for custom classes? I can take this if you're overloaded!

zucchini-nlp commented 1 month ago

Hey @Rocketknight1, thanks for investigating this.

The quickest workaround I see is to change this line in the Microsoft repo to be an `AutoImageProcessor`. I just tried it by modifying the code locally, and it works as expected because the auto class will be importable in any case. I am not sure, though, that forcing the auto class within our loading code would work in all cases, since some classes can be non-automappable? 🤔
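A hedged sketch of what that workaround amounts to. The attribute name follows transformers' `ProcessorMixin` convention of declaring component classes by string; the `Phi3VProcessor` class below is a stand-in for illustration, not the actual Hub code:

```python
# Stand-in for the custom processor class in the Hub repo. The real class
# subclasses transformers.ProcessorMixin, which resolves the string in
# image_processor_class to an actual class at load time.
class Phi3VProcessor:
    attributes = ["image_processor", "tokenizer"]

    # Before: a concrete class name, which loading resolves roughly via
    # getattr(transformers, "Phi3VImageProcessor") -- the lookup that fails
    # depending on import order.
    # image_processor_class = "Phi3VImageProcessor"

    # After: the auto class, which is always importable from transformers
    # itself, so the lookup succeeds regardless of import order.
    image_processor_class = "AutoImageProcessor"
```

The trade-off noted above still applies: this fixes the lookup for this model, but only because `AutoImageProcessor` happens to be resolvable; classes without an auto mapping could not use it.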

But I agree that we need a better solution here than forcing folks on the Hub to use auto classes. I cannot say right now what the best option would be, and yeah, it would be super cool if you have any ideas for a long-term solution 😄

Rocketknight1 commented 1 month ago

Yeah, that's a really good point! Let me investigate this when I get a chance; maybe I can come up with a (semi-)clean solution.

github-actions[bot] commented 5 days ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

DarkLight1337 commented 5 days ago

Any update on this?