xolox / vim-easytags

Automated tag file generation and syntax highlighting of tags in Vim
http://peterodding.com/code/vim/easytags/

Sorting issue #25

Closed jreybert closed 12 years ago

jreybert commented 12 years ago

I am facing a big issue with sorting in tags files. The sort order differs depending on whether I create the file from the command line or use UpdateTags in easytags. The problem is how underscores are interpreted. I don't know if it is always reproducible; maybe it is caused by some deeply hidden locale definition somewhere. I tried to play with LC_COLLATE, but I don't know if it is understood by Vim. BTW, here is a comment extracted from the help of the sort command in Vim (I know you use the sort function, but it may be related):

The details about sorting depend on the library function used. There is no guarantee that sorting is "stable" or obeys the current locale. You will have to try it out.

I am using:

The problem is that Vim tells me that some tags files are not sorted correctly (and I want to keep binary search!)

I can reproduce it with the following:

create a test.c file:

int test_pe_init_params;
int test_pe_try_credits;    

void test_pending_credits_init();
void test_pending_try_credits();

Now, create a tags file from command line:

ctags --sort=foldcase -R --c++-kinds=+p --c-kinds=+p --fields=+liaS

Copy this first version, then open test.c, run :UpdateTags, and check the diff between the two versions:

exuberant-ctags version

!_TAG_FILE_FORMAT   2   /extended format; --format=1 will not append ;" to lines/
!_TAG_FILE_SORTED   2   /0=unsorted, 1=sorted, 2=foldcase/
!_TAG_PROGRAM_AUTHOR    Darren Hiebert  /dhiebert@users.sourceforge.net/
!_TAG_PROGRAM_NAME  Exuberant Ctags //
!_TAG_PROGRAM_URL   http://ctags.sourceforge.net    /official site/
!_TAG_PROGRAM_VERSION   5.8 //
test_pending_credits_init   test.c  /^void test_pending_credits_init();$/;" p   language:C  file:
test_pending_try_credits    test.c  /^void test_pending_try_credits();$/;"  p   language:C  file:
test_pe_init_params test.c  /^int test_pe_init_params;$/;"  v   language:C
test_pe_try_credits test.c  /^int test_pe_try_credits;  $/;"    v   language:C

easytags version

!_TAG_FILE_FORMAT   2   /extended format; --format=1 will not append ;" to lines/
!_TAG_FILE_SORTED   2   /0=unsorted, 1=sorted, 2=foldcase/
!_TAG_PROGRAM_AUTHOR    Darren Hiebert  /dhiebert@users.sourceforge.net/
!_TAG_PROGRAM_NAME  Exuberant Ctags //
!_TAG_PROGRAM_URL   http://ctags.sourceforge.net    /official site/
!_TAG_PROGRAM_VERSION   5.8 //
test_pe_init_params /nfs/home/jreybert/test/easytags/test.c /^int test_pe_init_params;$/;"  v   language:C
test_pe_try_credits /nfs/home/jreybert/test/easytags/test.c /^int test_pe_try_credits;  $/;"    v   language:C
test_pending_credits_init   /nfs/home/jreybert/test/easytags/test.c /^void test_pending_credits_init();$/;" p   language:C  file:
test_pending_try_credits    /nfs/home/jreybert/test/easytags/test.c /^void test_pending_try_credits();$/;"  p   language:C  file:

xolox commented 12 years ago

Hi, and thanks for the feedback. I've known about this problem for quite a while and have even run into it myself now and then, but only with very large tags files, which made it hard to diagnose, so thanks for the small reproducible example. The problem is (obviously) that the sorting done by my Vim script code (I guess sort() and :sort use the same implementation) is incompatible with the sorting used by Exuberant Ctags. I'm at work right now, but I'll see if I have time to find a solution over the weekend.

xolox commented 12 years ago

The Vim documentation for the tagbsearch option contains the following note:

Note that case must be folded to uppercase for this to work.
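
To make that requirement concrete, here is a minimal Python sketch of a check for uppercase-folded ordering. It is only an illustration, not what Vim or the plug-in actually runs: the helper names tag_name and is_foldcase_sorted are hypothetical, only the tag name (the first tab-separated field) is compared, and the "!_TAG_" header lines are skipped for simplicity.

```python
def tag_name(line):
    """Return the tag name: the first tab-separated field of a tags line."""
    return line.split("\t", 1)[0]


def is_foldcase_sorted(lines):
    """Check that tag names are in non-decreasing order after folding to uppercase.

    Assumption for this sketch: "!_TAG_" pseudo-tag header lines are ignored
    and only tag names are compared, by code point.
    """
    names = [tag_name(l).upper() for l in lines if not l.startswith("!_TAG_")]
    return all(a <= b for a, b in zip(names, names[1:]))


# The two orderings from the report above, reduced to name and file fields.
ctags_order = [
    "test_pending_credits_init\ttest.c",
    "test_pending_try_credits\ttest.c",
    "test_pe_init_params\ttest.c",
    "test_pe_try_credits\ttest.c",
]
plugin_order = [
    "test_pe_init_params\ttest.c",
    "test_pe_try_credits\ttest.c",
    "test_pending_credits_init\ttest.c",
    "test_pending_try_credits\ttest.c",
]

print(is_foldcase_sorted(ctags_order))   # True
print(is_foldcase_sorted(plugin_order))  # False
```

Under this check the file written by Exuberant Ctags passes, while the ordering written by the plug-in does not, which is consistent with Vim complaining about the latter.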

I created several tags files based on the input highlight.py (from the vim-easytags repository) for comparison:

ctags --sort=no -f- --language-force=python highlight.py > no-sorting
ctags --sort=yes -f- --language-force=python highlight.py > simple-sorting
ctags --sort=foldcase -f- --language-force=python highlight.py > foldcase-sorting

I also saved a tags file using the easytags plug-in with the following patch applied to autoload/xolox/easytags.vim:

@@ -489,7 +489,7 @@ function! xolox#easytags#write_tagsfile(tagsfile, headers, entries) " {{{2
   if sort_order == 1
     call sort(a:entries)
   else
-    call sort(a:entries, 1)
+    call sort(a:entries, function('s:foldcase_cmp'))
   endif
   let lines = []
   if xolox#misc#os#is_win()
@@ -512,6 +512,12 @@ function! s:join_entry(value)
   return type(a:value) == type([]) ? join(a:value, "\t") : a:value
 endfunction

+function! s:foldcase_cmp(a, b)
+  let a = toupper(a:a)
+  let b = toupper(a:b)
+  return a == b ? 0 : a > b ? 1 : -1
+endfunction
+
 function! xolox#easytags#file_has_tags(filename) " {{{2
   " Check whether the given source file occurs in one of the tags files known
   " to Vim. This function might not always give the right answer because of

Even with this patch applied the sorting is not correct. It seems that the plug-in sorts underscores before alphanumeric characters while Exuberant Ctags (and sort -f) sort underscores after alphanumeric characters.
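
The underscore behaviour can be reproduced outside Vim. In ASCII the underscore (0x5F) sits between 'Z' (0x5A) and 'a' (0x61), so it sorts after letters when names are folded to uppercase and before letters when they are folded to lowercase. The Python sketch below shows both orders for the names from the report. That the plug-in's comparison effectively behaves like lowercase folding is an assumption made for illustration, and sort_folded is a hypothetical helper.

```python
NAMES = [
    "test_pe_init_params",
    "test_pe_try_credits",
    "test_pending_credits_init",
    "test_pending_try_credits",
]


def sort_folded(names, fold):
    """Sort names by code point after applying the given case-folding function."""
    return sorted(names, key=fold)


# Uppercase folding: 'N' (0x4E) < '_' (0x5F), so "test_pending..." comes first.
upper_order = sort_folded(NAMES, str.upper)

# Lowercase folding: '_' (0x5F) < 'n' (0x6E), so "test_pe_..." comes first.
lower_order = sort_folded(NAMES, str.lower)

print(upper_order)
print(lower_order)
```

The uppercase-folded order matches the file produced by Exuberant Ctags, and the lowercase-folded order matches the file written by the plug-in.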

xolox commented 12 years ago

Even with this patch applied the sorting is not correct. It seems that the plug-in sorts underscores before alphanumeric characters while Exuberant Ctags (and sort -f) sort underscores after alphanumeric characters.

I found the reason why the sorting still wasn't correct: the comparison should be a == b ? 0 : a ># b ? 1 : -1, which forces a case-sensitive comparison (after folding to uppercase). I'll commit the patch so you can confirm whether it resolves the issue.

jreybert commented 12 years ago

The fix solves the issue, great! The patch came very quickly, thanks!