Closed gmmyung closed 3 months ago
Are U+2013 and U+201C the only non-ascii characters, or are there others? Not sure how I can reliably find them all in one go
EDIT: seems like /[^\t-~] works ok
– “ ” ’ These are all I have found. I only looked in base.py, so there may be more in other files. I used the /[^\u0000-\u007F] search in vim to find them.
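To check every file in one go rather than searching base.py by hand, something like the sketch below might help. It is not part of tensordict or PyTorch; `find_non_ascii` and `scan_tree` are hypothetical helper names, and it assumes the sources are UTF-8 encoded `.py` files. It flags anything above U+007F, which is the same range the vim search above looks for.

```python
import unicodedata
from pathlib import Path


def find_non_ascii(text):
    """Return (line, column, char, label) for every character above U+007F.

    Line and column numbers are 1-based; lines are split on "\n" only.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                # Fall back to "UNKNOWN" for code points without a Unicode name.
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, ch, f"U+{ord(ch):04X} {name}"))
    return hits


def scan_tree(root, pattern="*.py"):
    """Print one 'path:line:col: U+XXXX NAME' entry per non-ASCII character."""
    for path in sorted(Path(root).rglob(pattern)):
        # errors="replace" keeps the scan going if a file is not valid UTF-8.
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_no, col, _ch, label in find_non_ascii(text):
            print(f"{path}:{line_no}:{col}: {label}")
```

Running `scan_tree("tensordict")` from the repository root (the directory name is an assumption about the checkout layout) should list each occurrence with its code point, so it is easy to see whether anything beyond the dash and quote characters shows up.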
There is no minimal example to reliably reproduce the bug, but here are related issues: https://github.com/pytorch/pytorch/issues/124960
https://github.com/pytorch/tensordict/blob/484a0456fa210a091f8063784f76179d652871db/tensordict/base.py#L532 https://github.com/pytorch/tensordict/blob/484a0456fa210a091f8063784f76179d652871db/tensordict/base.py#L2614
The codebase contains many non-ASCII characters, such as U+2013 (–) and U+201C (“), that cause PyTorch to panic while running torch.compile.