neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

[Fix] deepsparse – tokenizer stride size issue with original transformers #1425

Closed dbogunowicz closed 9 months ago

dbogunowicz commented 9 months ago

Fix for: https://app.asana.com/0/1205724081986514/1205993362080974 (the exhaustive explanation for this fix is in the ticket). @dbarbuzzi, this fix works for me; please confirm whether it also solves your issue. Will cherry-pick once approved.

dbarbuzzi commented 9 months ago

The changes work with the minimal example reported. Quick question, though: is it expected that a consumer who uses a non-default sequence_length must also manually define the doc_stride? I ask because if I, for example, set a sequence_length of 32 in the constructor with no other kwargs, it now fails the new check in the PR, since doc_stride simply defaults to 62 regardless.

dbogunowicz commented 9 months ago

@dbarbuzzi I understand your concerns. Let me take a look at what the HF default settings are.

dbogunowicz commented 9 months ago

@dbarbuzzi https://github.com/huggingface/transformers/blob/main/src/transformers/pipelines/question_answering.py#L408 It's pretty explicit here.

dbarbuzzi commented 9 months ago

> It's pretty explicit here.

Yes; however, if you look a couple of lines up at 405-406, transformers will automatically determine an appropriate value for doc_stride when it is left as None (the default). We don't have that same logic, nor the opportunity for it, because we default doc_stride to a predetermined value (62) instead of None.

Either way, this is less about the bug and more about whether the UX should be improved. That is where we differ: we have a hard-coded default value for doc_stride, while transformers has no default value and, if one is not set, automatically determines an appropriate one.

dbogunowicz commented 9 months ago

@dbarbuzzi let's set doc_stride to None by default and compute it dynamically to avoid the issue. I think this is fine.