Data-Liberation-Front / csvlint.rb

The gem behind http://csvlint.io
MIT License

optimize validation with regular expression #270

Closed youpy closed 1 year ago

youpy commented 1 year ago

This PR improves the performance of regular expression-based validation. With these changes, the time to validate a 10,000-line CSV file with 25 regular expression validations per line dropped from 10 seconds to 1 second.

The profiler results are shown below. Before the change, most of the time was spent validating with regular expressions.

[Profiler screenshots: before, after]

Changes proposed in this pull request:

Floppy commented 1 year ago

Wow! Instantiating regexes is expensive! Thanks :)
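
For readers following along, here is a minimal sketch of the general idea, assuming the speedup comes from compiling each pattern once and reusing it instead of calling `Regexp.new` for every cell. `PatternCache` and `invalid_cells` are hypothetical names for illustration and are not taken from csvlint's code or from this PR's diff.

```ruby
# Hypothetical helper (not csvlint's actual code): keeps one compiled
# Regexp per distinct pattern string.
class PatternCache
  def initialize
    @compiled = {}
  end

  # Returns the cached Regexp for +source+, compiling it on first use only.
  def fetch(source)
    @compiled[source] ||= Regexp.new(source)
  end

  # Number of distinct patterns compiled so far.
  def size
    @compiled.size
  end
end

# Checks every cell against the pattern for its column (if any) and
# returns [row_index, column_index] pairs for cells that do not match.
def invalid_cells(rows, patterns, cache = PatternCache.new)
  errors = []
  rows.each_with_index do |row, r|
    row.each_with_index do |value, c|
      pattern = patterns[c]
      next if pattern.nil?
      # Reuses the compiled Regexp instead of building a new one per cell.
      errors << [r, c] unless cache.fetch(pattern).match?(value.to_s)
    end
  end
  errors
end
```

With a cache like this, a file with many rows compiles each column's pattern once, no matter how many cells are checked against it.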