SublimeText / CTags

CTags support for Sublime Text
MIT License

Large tag files cause error when sorting #145

Open rednecknguyen opened 11 years ago

rednecknguyen commented 11 years ago

I'm on Mac OS X Lion. I've replaced /usr/bin/ctags with a proper ctags implementation. I'm new to Sublime Text and to CTags, and I haven't gotten CTags to work properly yet.

I'd love to get this working. Please help.

When attempting to build ctags, I'm getting the following error on every project I have (regardless of source tree - username removed):

Exception in thread Thread-5:
Traceback (most recent call last):
  File "X/threading.py", line 639, in _bootstrap_inner
  File "X/threading.py", line 596, in run
  File "ctagsplugin in /Users/[user]/Library/Application Support/Sublime Text 3/Installed Packages/CTags.sublime-package", line 103, in run
  File "ctagsplugin in /Users/[user]/Library/Application Support/Sublime Text 3/Installed Packages/CTags.sublime-package", line 682, in build_ctags
  File "ctags in /Users/[user]/Library/Application Support/Sublime Text 3/Installed Packages/CTags.sublime-package", line 178, in build_ctags
  File "ctags in /Users/[user]/Library/Application Support/Sublime Text 3/Installed Packages/CTags.sublime-package", line 157, in resort_ctags
  File "X/encodings/ascii.py", line 26, in decode
UnicodeDecodeError: 'ascii' codec can't decode byte 0x96 in position 1003: ordinal not in range(128)

rednecknguyen commented 11 years ago

Just FYI, this is what I get when I attempt things on Linux:

Re/Building CTags for /home/[user]/development/comm2_boost_testing/src/Common/.tags: Please be patient
Traceback (most recent call last):
  File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
    raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.NavigateToDefinition object at 0x7f1ff86d0c10>)
Traceback (most recent call last):
  File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
    raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.NavigateToDefinition object at 0x7f1ff86d0c10>)
Traceback (most recent call last):
  File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
    raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.ShowSymbols object at 0x7f1ff86d0c50>)
Traceback (most recent call last):
  File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
    raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.ShowSymbols object at 0x7f1ff86d0c50>)
Traceback (most recent call last):
  File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
    raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.ShowSymbols object at 0x7f1ff86d0c50>)
Traceback (most recent call last):
  File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
    raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.ShowSymbols object at 0x7f1ff86d0c50>)
Exception in thread Thread-3:
Traceback (most recent call last):
  File "X/threading.py", line 639, in _bootstrap_inner
  File "X/threading.py", line 596, in run
  File "ctagsplugin in /home/[user]/.config/sublime-text-3/Installed Packages/CTags.sublime-package", line 103, in run
  File "ctagsplugin in /home/[user]/.config/sublime-text-3/Installed Packages/CTags.sublime-package", line 682, in build_ctags
  File "ctags in /home/[user]/.config/sublime-text-3/Installed Packages/CTags.sublime-package", line 178, in build_ctags
  File "ctags in /home/[user]/.config/sublime-text-3/Installed Packages/CTags.sublime-package", line 157, in resort_ctags
  File "X/codecs.py", line 300, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 6202: invalid start byte
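Both tracebacks end in a decode failure on byte 0x96, which is not valid ASCII and is not a valid UTF-8 start byte. As a minimal sketch (not the plugin's actual code), the lines of a tag file can be sorted as raw bytes so that no decoding takes place. The function name `sort_tags_bytes`, the choice of sort column, and the sample data are assumptions made for illustration; the sketch also assumes every line ends with a newline.

```python
def sort_tags_bytes(raw, column=1):
    """Sort tag lines by one tab-separated column without decoding them."""
    lines = raw.splitlines(keepends=True)
    # Lines starting with "!_TAG_" are ctags header pseudo-tags; keep them first.
    header = [ln for ln in lines if ln.startswith(b"!_TAG_")]
    tags = [ln for ln in lines if not ln.startswith(b"!_TAG_")]

    def key(line):
        # Hypothetical key: the zero-based tab-separated column to sort on.
        fields = line.rstrip(b"\r\n").split(b"\t")
        return fields[column] if len(fields) > column else b""

    return b"".join(header + sorted(tags, key=key))


# Sample data (made up) containing the byte from the tracebacks.
sample = b"foo\tb.py\t1\nbar\x96\ta.py\t2\n"

for codec in ("ascii", "utf-8"):
    try:
        sample.decode(codec)
    except UnicodeDecodeError as exc:
        print(codec, "cannot decode the sample:", exc.reason)

print(sort_tags_bytes(sample))
```

Working on bytes avoids guessing the encoding of the source files that ctags indexed. If text is needed later, passing an explicit `encoding` and an `errors` policy to `open()` is another option.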

rednecknguyen commented 11 years ago

Well, after some investigating, it appears there is a memory issue or a buffer getting overloaded when the .tags file is large. In my case, for my source trees, the .tags files are larger than 25 MB.

When the resort_ctags function runs against this very large file, it hits the error. There is nothing wrong with the files except that they are large.

If, instead of calling the resort_ctags function directly, I create a subprocess that runs a new resort_tags.py file (which in turn runs that function), everything is fine.

Note: I manually installed CTags into the Packages folder. The folder is named CTags-Master as that's what was in the zip file I downloaded from this site.

As a quick and dirty workaround, in build_ctags() in ctags.py, I replaced the call to resort_ctags with the following.

_WARNING: I AM NOT A PYTHON PROGRAMMER IN ANY SENSE. I'M A C/C++ PROGRAMMER. REWRITE TO PYTHON STANDARD PROGRAMMING PRACTICES._

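# Path to the standalone script, quoted in case it contains spaces.
# Assumes the package folder is named 'CTags-master'.
resort_path = sublime.packages_path() + '/CTags-master/resort_tags.py'
cmd2 = 'python "%s" ./.tags' % resort_path

# Run the sort in a separate Python process instead of inside Sublime Text.
p2 = subprocess.Popen(cmd2, cwd=dirname(tag_file), shell=True, env=env,
                      stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

# communicate() drains the pipe while waiting, so a chatty child cannot block.
output, _ = p2.communicate()
if p2.returncode:
    raise EnvironmentError((cmd2, p2.returncode, output))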

I don't know if this is a Python issue where the interpreter gets a hiccup or what. But it occurs regularly with large .tags files of varying sizes, even when the source trees contain unrelated code.

astyagun commented 10 years ago

+1 I have the same problem in Mac OS X Mountain Lion

stephenfin commented 10 years ago

@rednecknguyen @astyagun I'm looking into this issue. Would either of ye happen to have a sample .tags file that I could test against?

astyagun commented 10 years ago

http://yadi.sk/d/wmFrx6U0BsZoR

You can also recreate it by generating tags for Rails framework for example.

stephenfin commented 10 years ago

So I've looked into this. The problem seems to be that the sorting takes place in memory: the built-in Python interpreter in ST must have some enforced memory limit that this hits (which would explain why the solution @rednecknguyen proposed works, since it spawns a new Python process outside of ST).

@rednecknguyen's solution (while good) isn't perfect though. It assumes that Python is installed on the system, and it basically just ups the memory ceiling, so the same issue could occur again with a larger file. I propose two possible solutions:

  1. Offload to the sort utility on Unix.
    • Pros: This would be far faster than anything we could achieve in Python.
    • Cons: While Windows does provide a sort utility it's very basic and won't let you sort on tabbed columns (as found in tag files). Hence we'd need to provide an alternative here.
  2. Reimplement the sort algorithm to use external sorts for large files
    • Pros: Would work without any external requirements - it's pure Python after all.
    • Cons: External sorts are slow. Even if you used a hybrid "sometimes-internal-sometimes-external" solution, deciding when to use an external vs. internal sort would be tricky.
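
As a rough illustration of option 2, an external sort can be sketched in pure Python: read the file in bounded chunks, sort each chunk in memory, spill it to a temporary file, then merge the sorted runs. This is a sketch under stated assumptions, not the code that ended up in the plugin. The names `external_sort` and `tab_column`, the chunk size, and the sort column are all hypothetical. Note that the `key` argument of `heapq.merge` requires Python 3.5 or later, and that every line is assumed to end with a newline.

```python
import heapq
import itertools
import os
import tempfile


def tab_column(index):
    """Return a key function that picks one tab-separated column of a line."""
    def key(line):
        fields = line.rstrip(b"\r\n").split(b"\t")
        return fields[index] if len(fields) > index else b""
    return key


def external_sort(src, dst, key=None, chunk_lines=100000):
    """Sort the lines of src into dst while holding at most one chunk in memory."""
    chunk_paths = []
    try:
        with open(src, "rb") as infile:
            while True:
                # Read at most chunk_lines lines and sort them in memory.
                chunk = list(itertools.islice(infile, chunk_lines))
                if not chunk:
                    break
                chunk.sort(key=key)
                # Spill the sorted run to a temporary file.
                fd, path = tempfile.mkstemp(suffix=".tags-chunk")
                with os.fdopen(fd, "wb") as out:
                    out.writelines(chunk)
                chunk_paths.append(path)

        # Merge the sorted runs lazily; only one line per run is held at a time.
        runs = [open(path, "rb") for path in chunk_paths]
        try:
            with open(dst, "wb") as out:
                out.writelines(heapq.merge(*runs, key=key))
        finally:
            for run in runs:
                run.close()
    finally:
        for path in chunk_paths:
            os.remove(path)
```

A call such as `external_sort(".tags", ".tags", key=tab_column(1))` would sort the file in place by its second column, because the input is fully read before the output is opened. Working in binary mode also means no decoding is involved.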

Opinions anyone?

davividal commented 10 years ago

What about reimplementing sort if running on windows?

http://stackoverflow.com/questions/1325581/how-do-i-check-if-im-running-on-windows-in-python
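
A minimal sketch of that idea, assuming a POSIX `sort` is on the PATH everywhere except Windows: pick the system utility where it can be assumed, and fall back to a Python implementation otherwise. The function names, the `fallback` callable, and the default column are hypothetical and only illustrate the platform switch. The flags used (`-t`, `-k`, `-o`) are standard POSIX `sort` options.

```python
import os
import platform
import subprocess


def use_system_sort(system=None):
    """Return True when a Unix-style sort utility can be assumed."""
    system = system or platform.system()
    return system != "Windows"


def build_sort_command(src, dst, column=2):
    """Build a POSIX sort command ordering src by one tab-separated column."""
    field = "%d,%d" % (column, column)  # sort counts fields from 1
    return ["sort", "-t", "\t", "-k", field, "-o", dst, src]


def sort_tag_file(src, dst, fallback, column=2, system=None):
    """Sort with the system utility where available, else call fallback."""
    if use_system_sort(system):
        # LC_ALL=C requests plain byte ordering, independent of the locale.
        env = dict(os.environ, LC_ALL="C")
        subprocess.check_call(build_sort_command(src, dst, column), env=env)
    else:
        # Hypothetical pure-Python sorter taking (src, dst).
        fallback(src, dst)
```

As noted in the reply that follows, this still leaves a second code path to maintain on Windows.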

stephenfin commented 10 years ago

Yeah - I considered that alright (point 1). However, we'll still have the same issue (albeit only on Windows). It's a kind of half-way solution that will only fix things for some people and add to the maintenance overhead. Good idea though :)

stephenfin commented 10 years ago

@astyagun @rednecknguyen I've pushed some changes to a feature branch. Would either of ye mind checking out that branch and seeing if it fixes things? Ye can see the changes made in the commits, but essentially there are now three ways to sort files:

These can be configured by setting the value of sort in settings, e.g. to enable the bucket sort:

{
  "sort": 1
}

astyagun commented 10 years ago

I can't reproduce the problem anymore. Either it's fixed in the version from Package Control already or some change in my setup has fixed it.

stephenfin commented 10 years ago

Well, that's good to hear. Hopefully it's the former. I'll wait to see if any other reports of the issue arise, and if not, I guess we can consider this issue closed.