mattpodolak / pmaw

A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches.
MIT License
213 stars 28 forks

parquet for caching responses #7

Open mattpodolak opened 3 years ago

mattpodolak commented 3 years ago

Use parquet instead of pickle for caching responses.

Need to assess if this is a reasonable improvement:

mattpodolak commented 3 years ago

This could be expanded by generalizing the caching mechanism so that it identifies the type of file used in the cache and loads it accordingly.

Responses should be saved to the cache in a default format (pickle/parquet/feather/jay), whichever is fastest, with an optional config setting to select a caching function.
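
A rough sketch of what that could look like, not pmaw's actual code: a small registry maps a file extension to a save/load pair, so loading picks the reader from the cached file's suffix and saving uses a configurable default. The names `register_format`, `save_cache`, `load_cache` and the `ext` parameter are all made up for illustration. Pickle is the only format registered unconditionally; parquet is registered only if pandas can be imported, it still needs a parquet engine such as pyarrow at call time, and it is not guaranteed to round-trip nested response fields exactly.

```python
import pickle
from pathlib import Path

# Hypothetical registry: file extension -> (save function, load function).
_FORMATS = {}


def register_format(ext, save, load):
    """Register a (save, load) pair for cache files ending in `ext`."""
    _FORMATS[ext] = (save, load)


def _save_pickle(obj, path):
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def _load_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


register_format(".pkl", _save_pickle, _load_pickle)

try:
    # Optional format: assumes responses are a list of flat dicts.
    import pandas as pd

    register_format(
        ".parquet",
        lambda rows, path: pd.DataFrame(rows).to_parquet(path),
        lambda path: pd.read_parquet(path).to_dict("records"),
    )
except ImportError:
    pass  # pandas not installed; only pickle is available


def save_cache(obj, cache_dir, key, ext=".pkl"):
    """Write `obj` to `cache_dir/key + ext` with the format registered for `ext`."""
    save, _ = _FORMATS[ext]
    path = Path(cache_dir) / f"{key}{ext}"
    save(obj, path)
    return path


def load_cache(path):
    """Choose the loader from the file's extension and return the cached object."""
    path = Path(path)
    try:
        _, load = _FORMATS[path.suffix]
    except KeyError:
        raise ValueError(f"unknown cache format: {path.suffix!r}") from None
    return load(path)
```

With this shape, changing the default would mean changing the `ext` default (or passing it from config), and feather or jay could be added with one more `register_format` call once timings show which is fastest.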