tmm1 / ripper-tags

fast, accurate ctags generator for ruby source code using Ripper
MIT License

Make ripper-tags faster? #90

Closed bibstha closed 4 years ago

bibstha commented 5 years ago

Hi,

We have a large codebase, and ripper-tags takes around 25 minutes to process all of it. Is there a way to make it faster?

A significant portion of the code doesn't change between runs, so re-running it mostly repeats work that was already done.

Some thoughts:

  1. Can ripper-tags be run on JRuby in parallel, with the outputs merged afterwards?
  2. Is there a way to run it only on files updated since the last run?

I run it as:

git ls-files | \
  ripper-tags --tag-relative -L - -f"$dir/$$.tags" --exclude="*.js" --exclude="*.sql"

mislav commented 5 years ago

I'm sorry that ripper-tags is slow for you!

  1. Can ripper-tags be run on JRuby in parallel, with the outputs merged afterwards?

You could definitely run ripper-tags manually in parallel processes, which would make use of multiple CPU cores. You would have to split the git ls-files list into N pieces, start N ripper-tags processes that each write to a separate file, and finally merge those tags files into one and sort it. JRuby isn't needed for this.
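The split, run, and merge steps described above could be sketched roughly as follows. This is only an illustration, not something ripper-tags ships: the helper names `split_into_slices`, `merge_tags`, and `parallel_tags` are hypothetical, and it assumes the tags files use the usual ctags layout in which header lines start with `!_TAG_`. It uses only the `--tag-relative`, `-L -`, and `-f` flags from the command in the original post.

```ruby
# Sketch: run several ripper-tags processes in parallel, then merge their output.
# All helper names below are hypothetical, not part of ripper-tags.

# Split the file list into at most `n` roughly equal slices.
def split_into_slices(files, n)
  return [] if files.empty?
  files.each_slice((files.length / n.to_f).ceil).to_a
end

# Merge the contents of several tags files: keep each "!_TAG_" header line
# once and sort the remaining entries bytewise.
def merge_tags(contents)
  lines = contents.flat_map { |text| text.lines(chomp: true) }
  headers, entries = lines.partition { |line| line.start_with?("!_TAG_") }
  (headers.uniq + entries.sort).map { |line| "#{line}\n" }.join
end

# Start one ripper-tags process per slice, wait for all of them, then merge.
# Partial files are written next to `output` so that --tag-relative resolves
# paths the same way for every part.
def parallel_tags(files, jobs: 4, output: "tags")
  parts = split_into_slices(files, jobs).each_with_index.map do |slice, i|
    list = "#{output}.list#{i}"
    part = "#{output}.part#{i}"
    File.write(list, slice.join("\n") + "\n")
    # "-L -" reads the file list from stdin, redirected here from `list`.
    pid = Process.spawn("ripper-tags", "--tag-relative", "-L", "-", "-f", part, in: list)
    [pid, list, part]
  end
  parts.each do |pid, _, _|
    Process.wait(pid)
    raise "ripper-tags failed" unless $?.success?
  end
  File.write(output, merge_tags(parts.map { |_, _, part| File.read(part) }))
ensure
  # Remove the temporary list and part files, whether or not the run succeeded.
  (parts || []).each do |_, list, part|
    [list, part].each { |path| File.delete(path) if File.exist?(path) }
  end
end
```

It might be called as ``parallel_tags(`git ls-files`.split("\n"), jobs: 4)``. Separate OS processes are used rather than threads, which is why plain MRI is enough here.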

2. Is there a way to run it only on files updated since the last run?

No, but I suspect it could be done like so with a script:

  1. get the modification time (mtime) of the existing tags file,
  2. select all files in git ls-files that are newer than that mtime,
  3. run ripper-tags against only those files and save the output as tags.new,
  4. delete any mention of those modified files from the existing tags file,
  5. merge tags and tags.new into a single file.
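The five steps above could look something like this in Ruby. It is a minimal sketch under stated assumptions: the helper names (`modified_since`, `drop_entries_for`, `update_tags`) are hypothetical, the script is assumed to run from the repository root with the tags file stored there, and the paths recorded in the tags file are assumed to match the paths printed by git ls-files.

```ruby
require "set"

# Step 2: keep the paths that exist and were modified after `reference_time`.
def modified_since(files, reference_time)
  files.select { |path| File.file?(path) && File.mtime(path) > reference_time }
end

# Step 4: drop entries whose file field (second tab-separated column) is one
# of `paths`. "!_TAG_" header lines are always kept.
def drop_entries_for(tag_lines, paths)
  stale = paths.to_set
  tag_lines.reject do |line|
    !line.start_with?("!_TAG_") && stale.include?(line.split("\t", 3)[1])
  end
end

# Steps 1-5 combined (hypothetical helper; see the assumptions above).
def update_tags(tags_path = "tags")
  tags_mtime = File.mtime(tags_path)                              # step 1
  files = IO.popen(%w[git ls-files -z], &:read).split("\0")
  changed = modified_since(files, tags_mtime)                     # step 2
  return if changed.empty?

  new_path = "#{tags_path}.new"
  # Step 3: feed the changed paths to ripper-tags on stdin via "-L -".
  IO.popen(["ripper-tags", "-L", "-", "-f", new_path], "w") { |io| io.puts(changed) }
  raise "ripper-tags failed" unless $?.success?

  kept = drop_entries_for(File.readlines(tags_path, chomp: true), changed)  # step 4
  headers, entries = (kept + File.readlines(new_path, chomp: true))
                     .partition { |line| line.start_with?("!_TAG_") }
  # Step 5: write headers once, followed by all entries sorted bytewise.
  File.write(tags_path, (headers.uniq + entries.sort).map { |line| "#{line}\n" }.join)
  File.delete(new_path)
end
```

One limitation of this sketch: entries for files that were deleted since the last run are left in place, because only files that still exist are compared against the mtime.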

ripper-tags doesn't currently have any tools for regenerating and merging tags files, but it could be useful to have the regeneration functionality built in. Until then, you'll have to resort to your own scripts.

P.S. you don't need --exclude="*.js" --exclude="*.sql" because ripper-tags only ever processes *.rb files.

mislav commented 4 years ago

2. Is there a way to run it only on files updated since the last run?

Now there is a way: call ripper-tags --append <files...> (where <files...> is the list of updated files) and it will add the new tags from the updated files to the existing tags file. See https://github.com/tmm1/ripper-tags/pull/96
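As a small illustration of how `--append` might be wired into a script, the sketch below builds the command line from a list of changed paths. The helper name `append_argv` is hypothetical, and taking the changed paths from `git diff --name-only` is just one possible choice, not something the pull request prescribes.

```ruby
# Build the argv for an incremental update with --append.
# Returns nil when no Ruby files changed, so the caller can skip the run.
def append_argv(changed_paths)
  ruby_files = changed_paths.select { |path| path.end_with?(".rb") }
  return nil if ruby_files.empty?
  ["ripper-tags", "--append", *ruby_files]
end

# Example (not executed here): update tags for files touched by the last commit.
#   changed = IO.popen(%w[git diff --name-only HEAD~1 HEAD], &:read).split("\n")
#   argv = append_argv(changed)
#   system(*argv) if argv
```

Paths of files that were deleted in the commit would also show up in that diff, so a real script would probably want to filter those out first.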