pytorch / text

Models, data loaders and abstractions for language processing, powered by PyTorch
https://pytorch.org/text
BSD 3-Clause "New" or "Revised" License

Optionally ignore utf-8 decoding error for scripted C++ tokenizers. (… #2134

Closed Nayef211 closed 1 year ago

Nayef211 commented 1 year ago

…#2128)

Summary: Pull Request resolved: https://github.com/pytorch/text/pull/2128

Adds a binding and a test to make sure we can use the 'ignore' option for UTF-8 decoding, which was added to PyTorch in D43970697 (https://github.com/pytorch/pytorch/pull/97282).
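For readers unfamiliar with what "ignoring" a UTF-8 decoding error means, here is a minimal sketch using only Python's built-in `bytes.decode`. It does not call the torchtext binding added in this PR (its name is not shown here); the helper names `decode_strict` and `decode_ignore` are hypothetical and exist only for illustration. The scenario assumed is a byte-level tokenizer that emits a token ending in the middle of a multi-byte character.

```python
# Illustration only: plain Python, not the torchtext/PyTorch binding.
# b"\xe2\x82" is the first two bytes of the three-byte encoding of the
# euro sign, so the sequence below ends with an incomplete character.
raw = b"hello \xe2\x82"


def decode_strict(data: bytes) -> str:
    """Decode with the default 'strict' handler, which raises on bad bytes."""
    return data.decode("utf-8")


def decode_ignore(data: bytes) -> str:
    """Decode with the 'ignore' handler, which silently drops bad bytes."""
    return data.decode("utf-8", errors="ignore")


try:
    decode_strict(raw)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

ignored = decode_ignore(raw)

print(strict_failed)   # True
print(repr(ignored))   # 'hello '
```

The trade-off is that 'ignore' never raises but can silently lose data, which is presumably why the option described above is opt-in rather than the default.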

Reviewed By: Nayef211

Differential Revision: D44315169

fbshipit-source-id: d42fcacafd429cf586c631faf826abc172b173d3

codecov[bot] commented 1 year ago

Codecov Report

Merging #2134 (3b687ae) into main (f151b4c) will not change coverage. The diff coverage is n/a.

:exclamation: Current head 3b687ae differs from pull request most recent head ff1aa87. Consider uploading reports for the commit ff1aa87 to get more accurate results



github-advanced-security[bot] commented 1 year ago

You have successfully added a new CodeQL configuration .github/workflows/codeql.yml:build. As part of the setup process, we have scanned this repository and found 3 existing alerts. Please check the repository Security tab to see all alerts.