earwig / bitshift

A semantic search engine for source code
https://bitshift.benkurtovic.com/
MIT License

Ruby parser bugs #80

Open earwig opened 10 years ago

earwig commented 10 years ago

Got this while indexing ruby/ruby:

14-07-06 01:47:15 DEBUG bitshift.crawler.indexer.GitIndexer Indexing file: ruby/ruby: test/ruby/test_primitive.rb
14-07-06 01:47:15 ERROR bitshift.crawler.indexer.GitIndexer Exception raised while parsing:
Traceback (most recent call last):
  File "bitshift/crawler/indexer.py", line 169, in _insert_repository_codelets
    parse(codelet)
  File "bitshift/parser/__init__.py", line 83, in parse
    symbols = PARSERS[lang_string](codelet)
  File "bitshift/parser/__init__.py", line 51, in parse_via_proc
    symbols = json.loads(data)
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
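
The failure is in the json.loads call inside parse_via_proc, which suggests the parser subprocess printed nothing, or something that is not JSON, for this file. A minimal sketch of a guard follows, assuming the subprocess output is available as a string or bytes. `load_symbols` is a hypothetical helper, not something in bitshift, and it is written for Python 3 rather than the 2.7 shown in the traceback.

```python
import json


def load_symbols(data):
    """Decode parser output, returning None instead of raising on bad JSON.

    Hypothetical helper for illustration; not part of bitshift.
    """
    if isinstance(data, bytes):
        data = data.decode("utf-8", errors="replace")
    if not data.strip():
        # The subprocess printed nothing, e.g. because it crashed.
        return None
    try:
        return json.loads(data)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError on Python 3.
        return None
```

Returning None lets the caller log the file and move on rather than aborting; whether to skip the file or index it without symbols is a separate decision.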

earwig commented 10 years ago

Also, a bunch of ruby processes were left running after the indexer moved on to the next repo; not sure what was going on there. Eventually I started getting the error below, which I think is a consequence of that. By this point it was triggering for more than half of the files:

14-07-06 01:49:20 DEBUG bitshift.crawler.indexer.GitIndexer Indexing file: ruby/ruby: test/rake/test_rake_package_task.rb
14-07-06 01:49:20 ERROR bitshift.crawler.indexer.GitIndexer Exception raised while parsing:
Traceback (most recent call last):
  File "bitshift/crawler/indexer.py", line 169, in _insert_repository_codelets
    parse(codelet)
  File "bitshift/parser/__init__.py", line 83, in parse
    symbols = PARSERS[lang_string](codelet)
  File "bitshift/parser/__init__.py", line 48, in parse_via_proc
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1223, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
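
If the leftover ruby processes are what exhausts memory, one possible mitigation is to bound each parser run and always reap the child. This is a sketch only, under the assumption that a hung parser can simply be killed. `run_parser` and its `timeout` default are hypothetical, and the `timeout` argument to `communicate` requires Python 3.3 or later, not the 2.7 in the traceback.

```python
import subprocess
import sys


def run_parser(cmd, source, timeout=10):
    """Run cmd with source (bytes) on stdin and always reap the child.

    Hypothetical helper; returns stdout as bytes, or None on timeout.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    try:
        out, _ = proc.communicate(source, timeout=timeout)
        return out
    except subprocess.TimeoutExpired:
        # communicate() does not kill the child on timeout, so do it here.
        proc.kill()
        proc.communicate()  # wait for exit so no process is left behind
        return None


if __name__ == "__main__":
    # Stand-in for the real parser: a child that echoes stdin to stdout.
    echo = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
    print(run_parser(echo, b"puts 1"))
```

This does not explain why the processes were hanging in the first place; it only keeps them from accumulating.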

Also, this Unicode issue on a file containing non-ASCII characters:

14-07-06 01:47:38 DEBUG bitshift.crawler.indexer.GitIndexer Indexing file: ruby/ruby: ext/tk/sample/demos-jp/icon.rb
14-07-06 01:47:38 ERROR bitshift.crawler.indexer.GitIndexer Exception raised while parsing:
Traceback (most recent call last):
  File "bitshift/crawler/indexer.py", line 169, in _insert_repository_codelets
    parse(codelet)
  File "bitshift/parser/__init__.py", line 83, in parse
    symbols = PARSERS[lang_string](codelet)
  File "bitshift/parser/__init__.py", line 50, in parse_via_proc
    data = proc.communicate(codelet.code)[0]
  File "/usr/lib/python2.7/subprocess.py", line 799, in communicate
    return self._communicate(input)
  File "/usr/lib/python2.7/subprocess.py", line 1401, in _communicate
    stdout, stderr = self._communicate_with_poll(input)
  File "/usr/lib/python2.7/subprocess.py", line 1465, in _communicate_with_poll
    input_offset += os.write(fd, chunk)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 96-105: ordinal not in range(128)
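
The traceback ends in os.write being handed a unicode object, which Python 2 implicitly encodes as ASCII. Encoding the code explicitly before passing it to communicate would avoid that. A minimal sketch, assuming the parser subprocess expects UTF-8 on stdin (an assumption, not confirmed here); `to_stdin_bytes` is a hypothetical helper.

```python
def to_stdin_bytes(code, encoding="utf-8"):
    """Return code as bytes suitable for Popen.communicate().

    Hypothetical helper; the UTF-8 default is an assumption about
    what the parser subprocess expects to read.
    """
    if isinstance(code, bytes):
        # Already encoded; pass through unchanged.
        return code
    return code.encode(encoding)
```

The call site would then look something like `proc.communicate(to_stdin_bytes(codelet.code))`. The Ruby side would also need to read its input as UTF-8 for the symbols to come back intact.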