brucehsu / GobiesVM

A Ruby VM written in Go that aims to exploit parallelism via Software Transactional Memory
MIT License
184 stars, 8 forks

Poor parsing performance #11

Open brucehsu opened 10 years ago

brucehsu commented 10 years ago

Example:

Thread.new do
    lines = IO.readlines '../words_1.txt'
    cnt = {'cat'=>0, 'dog'=>0, 'cow'=>0, 'tiger'=>0, 'lion'=>0, 'wolf'=>0, 'fish'=>0}
    lines.each do |l|
        words = l.split(' ')
        words.each do |word|
            cnt[word] = cnt[word] + 1
        end
    end
    p cnt
end

This is significantly slower than the following example, which is the same code without the enclosing Thread.new block:

lines = IO.readlines '../words_1.txt'
cnt = {'cat'=>0, 'dog'=>0, 'cow'=>0, 'tiger'=>0, 'lion'=>0, 'wolf'=>0, 'fish'=>0}
lines.each do |l|
    words = l.split(' ')
    words.each do |word|
        cnt[word] = cnt[word] + 1
    end
end
p cnt

brucehsu commented 10 years ago

The parser seems to perform poorly on nested blocks. Initial investigation shows that when a rule fails, the parser restarts from the last successfully matched syntax rule.
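
To illustrate why backtracking can hurt so much on nested blocks, here is a minimal sketch of a toy recursive-descent parser. It is not GobiesVM's parser: the grammar, the ToyParser class, and the helpers nested_tokens and count_stmt_calls are all hypothetical. The assumption is that the parser tries a longer alternative first (a block followed by a chained call), and when that fails, restarts the rule and parses the whole block body again. Under that assumption, the work roughly doubles with every nesting level, while caching results per input position keeps it linear.

```ruby
# Toy illustration only. This is NOT GobiesVM's parser; the grammar and all
# names here are hypothetical.
#
#   stmt := "call" "do" stmt* "end" "." "call"   (alternative 1)
#         | "call" "do" stmt* "end"              (alternative 2)
#         | "call"                               (alternative 3)
class ToyParser
  attr_reader :stmt_calls

  def initialize(tokens, memoize: false)
    @tokens = tokens
    @memo = memoize ? {} : nil # position => result cache (packrat style)
    @stmt_calls = 0            # how many times stmt had to do real work
  end

  # Returns the position just past one stmt starting at pos, or nil on failure.
  def stmt(pos)
    return @memo[pos] if @memo&.key?(pos)

    @stmt_calls += 1
    # Ordered choice: each failed alternative restarts from pos.
    result = block_with_chain(pos) || block(pos) || call(pos)
    @memo[pos] = result if @memo
    result
  end

  private

  def call(pos)
    @tokens[pos] == :call ? pos + 1 : nil
  end

  def block(pos)
    after_call = call(pos)
    return nil unless after_call && @tokens[after_call] == :do

    pos = after_call + 1
    # Parse the block body: zero or more nested statements.
    while (nxt = stmt(pos))
      pos = nxt
    end
    @tokens[pos] == :end ? pos + 1 : nil
  end

  def block_with_chain(pos)
    after_block = block(pos)
    return nil unless after_block && @tokens[after_block] == :dot

    call(after_block + 1)
  end
end

# Builds tokens for `depth` nested blocks around a single call.
def nested_tokens(depth)
  [:call, :do] * depth + [:call] + [:end] * depth
end

# Parses a nested input and reports how many stmt invocations were needed.
def count_stmt_calls(depth, memoize:)
  tokens = nested_tokens(depth)
  parser = ToyParser.new(tokens, memoize: memoize)
  raise 'toy parse failed' unless parser.stmt(0) == tokens.length

  parser.stmt_calls
end

[1, 2, 4, 8, 16].each do |depth|
  naive = count_stmt_calls(depth, memoize: false)
  memo = count_stmt_calls(depth, memoize: true)
  puts "depth=#{depth} naive=#{naive} memoized=#{memo}"
end
```

In this toy grammar, each extra level of nesting makes the non-memoized parser parse the inner body twice. If GobiesVM's parser behaves similarly, that would be consistent with wrapping the example in Thread.new (one more nesting level) being noticeably slower, but that remains to be confirmed by profiling the actual parser.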

brucehsu commented 10 years ago

It might be worth investigating goyacc and the new lexer from Sourcegraph: https://sourcegraph.com/blog/multi-language-lexer-and-scanner-for-go