craymichael closed this pull request 3 weeks ago.
This pull request was exported from Phabricator. Differential Revision: D64998804
This pull request has been merged in pytorch/captum@ad89e0bb2bb4fa061d576ff3bfc387aa3ea5939d.
Summary: Use the tokenizer's `__call__` method, which returns a `BatchEncoding` that includes character offsets. This lets us slice each piece of text out of the fully decoded string instead of making assumptions about how many tokens correspond to a single string.
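
A minimal sketch of the idea described in the summary, not the code from this change: given per-token `(start, end)` character offsets, each piece of text is sliced directly from the full string rather than inferred from token counts. The helper name `spans_from_offsets` and the offset values are hypothetical; in practice the offsets would presumably come from the `offset_mapping` entry that a fast Hugging Face tokenizer returns when called with `return_offsets_mapping=True`.

```python
from typing import List, Sequence, Tuple


def spans_from_offsets(text: str, offsets: Sequence[Tuple[int, int]]) -> List[str]:
    """Slice `text` by per-token (start, end) character offsets.

    Hypothetical helper for illustration; it is not part of Captum.
    """
    return [text[start:end] for start, end in offsets]


# Hand-written offsets (an assumption for this example), shaped like the
# "offset_mapping" a fast tokenizer returns for a single string via
# tokenizer(text, return_offsets_mapping=True).
text = "unbelievable day"
offsets = [(0, 2), (2, 12), (12, 16)]

pieces = spans_from_offsets(text, offsets)
print(pieces)
```

Because every piece is a slice of the original string, this works regardless of how many tokens a word or character happens to split into.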
Differential Revision: D64998804