openai / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"
https://openai.com/blog/better-language-models/

Local path resolution #304

Closed by Noah-Huppert 2 years ago

Noah-Huppert commented 2 years ago

There was a regression in the code base where absolute paths were stored in the training metadata and tokenizer index files. As a result, these files couldn't be sent to others and used on their systems. This PR introduces the LocalPath wrapper class, which exposes a file path both relative to the project root and as an absolute path. The code was updated to use this class and to never save an absolute path to either of the aforementioned JSON files. Additionally, some logic was added to the LocalPath class to detect absolute paths from other systems and make a best effort to resolve them on the new user's machine.
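For readers without the diff at hand, here is a minimal sketch of how a wrapper like this could work. It is not the PR's actual code: the attribute and method names (`relative`, `absolute`, `to_json`) are hypothetical, and so is the heuristic of matching on the project directory's name to resolve foreign absolute paths, which assumes the project folder has the same name on both machines.

```python
import tempfile
from pathlib import Path, PurePosixPath, PureWindowsPath


class LocalPath:
    """Hypothetical sketch: a path that is always stored relative to a project root."""

    def __init__(self, path, root):
        self.root = Path(root).resolve()
        self.relative = self._relativize(str(path))

    def _relativize(self, raw):
        # Case 1: an absolute path on this machine that lies under the project root.
        native = Path(raw)
        if native.is_absolute():
            try:
                return PurePosixPath(*native.resolve().relative_to(self.root).parts)
            except ValueError:
                pass  # not under our root; it may come from another system

        # Parse with Windows rules if the path has a drive or UNC prefix, else as POSIX.
        win = PureWindowsPath(raw)
        pure = win if win.is_absolute() else PurePosixPath(raw.replace("\\", "/"))
        if not pure.is_absolute():
            return PurePosixPath(*pure.parts)  # already relative

        # Case 2 (best effort): a foreign absolute path. Assume the project
        # directory has the same name there and keep whatever follows it.
        parts = pure.parts
        if self.root.name in parts:
            idx = len(parts) - 1 - parts[::-1].index(self.root.name)
            return PurePosixPath(*parts[idx + 1:])

        # Last resort: keep only the file name.
        return PurePosixPath(pure.name)

    @property
    def absolute(self):
        """Absolute location on the current machine."""
        return self.root.joinpath(*self.relative.parts)

    def to_json(self):
        """Value to write to a JSON file: always relative, always '/'-separated."""
        return self.relative.as_posix()


# Usage: a path recorded on another machine resolves under the local root.
root = Path(tempfile.mkdtemp()) / "gpt-2"
root.mkdir()
foreign = LocalPath("/home/alice/gpt-2/models/encoder.json", root)
print(foreign.to_json())  # models/encoder.json
```

Because only `to_json()` output is ever written to disk, the JSON files stay portable. The absolute form is recomputed from the local root each time it is needed.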

Noah-Huppert commented 2 years ago

Shoot, this was meant to go into my fork, sorry!