Closed shivanraptor closed 1 year ago
Hi, by default .read_chat(url) ultimately calls pylangacq.Reader.from_zip, which caches the downloaded data to ~/.pylangacq/ and loads from the cached data if found. Was this not the behavior you saw on your end?
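For readers unfamiliar with the pattern, the download-once-then-reuse behavior described above can be sketched roughly as follows. This is only an illustration, not pylangacq's actual implementation: the function name cached_download, the SHA-256 cache key, and the injectable fetch parameter are all hypothetical.

```python
import hashlib
import urllib.request
from pathlib import Path


def _fetch(url: str) -> bytes:
    """Download the raw bytes at `url` (default fetcher)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def cached_download(url, cache_dir, fetch=_fetch):
    """Return a local path to the file at `url`, downloading at most once.

    Hypothetical helper: the name, the cache-key scheme, and the `fetch`
    parameter are assumptions made for this sketch.
    """
    cache_dir = Path(cache_dir).expanduser()
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Derive a stable file name from the URL so repeated calls hit the same file.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    target = cache_dir / f"{key}.zip"

    # Only go to the network when no cached copy exists yet.
    if not target.exists():
        target.write_bytes(fetch(url))
    return target
```

Calling cached_download(url, "~/.pylangacq") twice with the same URL would download on the first call and return the existing file on the second.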
Oh my bad, I didn't notice the cached folder.
Feature you are interested in and your specific question(s): While using .read_chat(url), the ZIP file is downloaded, extracted and parsed every time the function is executed. Execution time and download time can be saved by caching the files in a local folder like ~/.cache/pycantonese/chatdata/, just like HuggingFace's .from_pretrained(model) and datasets.load_dataset() (and many other similar functions).
What you are trying to accomplish with this feature or functionality: Decrease execution time, increase performance.
Additional context: