wenbingl closed 2 months ago
I wonder if even for unit tests here we can extract test data JSON files from HF to avoid adding the large files to our repo? Since we'll only be running the tests when we make changes to the C API and end users won't need to run them, the time taken to download them at runtime should be fine.
Downloading HF data from the native C tests would add an extra code dependency to the tests. Unless the tokenizer data becomes much larger than it is now, we probably don't need to worry about this yet.