Open markprzepiora opened 5 years ago
This looks awesome. Thank you for writing it so that loading can be sped up. I just wish the gem author would merge it now.
@markprzepiora, I'd like to explore options for storing the data.
What I ended up doing in a fork was converting the YAML to CSV and using the FastCSV gem to process the data more quickly and to avoid loading it all into memory at once. However, I'm not convinced that was the best approach.
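To make the streaming idea concrete, here is a minimal sketch. It uses Ruby's standard `csv` library rather than FastCSV, and the column names, sample rows, and the `find_zip` helper are all hypothetical, not taken from the fork.

```ruby
require "csv"
require "tempfile"

# Hypothetical columns and rows, for illustration only.
SAMPLE_CSV = <<~CSV
  zip,city,state
  10001,New York,NY
  60601,Chicago,IL
  94105,San Francisco,CA
CSV

# Scan the file one row at a time and stop at the first match,
# so the whole data set is never held in memory at once.
def find_zip(path, zip)
  CSV.foreach(path, headers: true) do |row|
    return row.to_h if row["zip"] == zip
  end
  nil
end

# Write the sample to a temporary file to stand in for the bundled data file.
file = Tempfile.new(["zip_codes", ".csv"])
file.write(SAMPLE_CSV)
file.close

chicago = find_zip(file.path, "60601")
missing = find_zip(file.path, "00000")
file.unlink
```

The trade-off is that every lookup is a linear scan of the file, so this suits occasional lookups better than repeated ones.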
Another idea is to keep the YAML around for development purposes but bundle a SQLite database generated by a rake task.
Since you've opened this PR I wanted to get your feedback.
Library data like this really shouldn't be distributed via Marshal - the format of Marshal is not guaranteed to be stable between Ruby versions (and indeed, is not in practice). I'd suggest either pursuing fast CSV options or just querying a SQLite file.
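For anyone who still wants a Marshal cache locally, one possible guard is to check the format version that Marshal writes into the first two bytes of every dump, and rebuild from the canonical source when it does not match. This is only a sketch: `marshal_compatible?` and `load_cached` are hypothetical helpers, and the check covers the Marshal format version only, not changes to the classes being serialized.

```ruby
# Returns true when this interpreter's Marshal can read a dump with this header.
# Marshal loads data with the same major version and an equal or lower minor version.
def marshal_compatible?(bytes)
  return false if bytes.nil? || bytes.bytesize < 2

  major, minor = bytes.unpack("CC")
  major == Marshal::MAJOR_VERSION && minor <= Marshal::MINOR_VERSION
end

# Hypothetical helper: use the cached dump when it is readable,
# otherwise fall back to the block (e.g. re-parsing the YAML source).
def load_cached(bytes)
  return Marshal.load(bytes) if marshal_compatible?(bytes)

  yield
end

data = { "10001" => "New York" }
dump = Marshal.dump(data)

from_cache = load_cached(dump) { raise "should not rebuild" }

# Simulate a dump written with a different major version by altering the header.
stale = "\x03\x00".b + dump.byteslice(2, dump.bytesize)
rebuilt = load_cached(stale) { data }
```

Even with a guard like this, the dump is best treated as a disposable cache generated on the user's machine rather than a file shipped inside the gem.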
Loading 4MB of YAML data takes about 1 second on my development machine. Replacing the `YAML`-dumped data with `Marshal`-dumped data instead brings this down to about 0.08 seconds. This makes a big difference in feedback speed, especially when running individual tests that rely on `ZipCode.identify`.
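A rough way to reproduce this kind of comparison is sketched below. The data is synthetic and much smaller than the gem's real file, and the `timed` helper is hypothetical, so the absolute numbers will not match the figures above.

```ruby
require "yaml"

# Synthetic stand-in for the gem's data; the real file is about 4MB.
data = (10_000...12_000).each_with_object({}) do |i, hash|
  hash[i.to_s] = { "city" => "City #{i}", "state" => "NY" }
end

# Hypothetical helper: returns the block's result and the elapsed seconds.
def timed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - start]
end

yaml_text = YAML.dump(data)
marshal_bytes = Marshal.dump(data)

from_yaml, yaml_seconds = timed { YAML.load(yaml_text) }
from_marshal, marshal_seconds = timed { Marshal.load(marshal_bytes) }

puts format("YAML: %.4fs  Marshal: %.4fs", yaml_seconds, marshal_seconds)
```

Both loaders return the same hash here, so the only difference being measured is deserialization time.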