Accessing the search from within Emacs

greghendershott / racket-mode

Emacs major and minor modes for Racket: edit, REPL, check-syntax, debug, profile, packages, and more.

https://www.racket-mode.com/

GNU General Public License v3.0


Accessing the search from within Emacs #507

Closed yilinwei closed 3 years ago

yilinwei commented 3 years ago

Hi,

A few months ago I asked on the Racket Slack channel whether it would be possible to access the documentation search via Emacs. I recently had a few spare cycles and am happy to report that it is possible by reusing the index generated by Scribble.

I've put together a raco command which does this here; it currently searches by Levenshtein distance between the query text and the index entries.
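As a rough illustration of distance-based ranking (not the raco command's actual code, which is not shown in this thread), here is a minimal Emacs Lisp sketch. It assumes Emacs 27 or later for the built-in `string-distance`; the function name `my/rank-by-levenshtein` is hypothetical.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'seq)

(defun my/rank-by-levenshtein (query candidates &optional limit)
  "Return CANDIDATES sorted by Levenshtein distance to QUERY, closest first.
If LIMIT is non-nil, return at most LIMIT candidates."
  (let ((ranked (seq-sort-by (lambda (candidate)
                               ;; `string-distance' returns the Levenshtein
                               ;; edit distance between two strings.
                               (string-distance query candidate))
                             #'<
                             candidates)))
    (if limit (seq-take ranked limit) ranked)))

;; Usage: an exact match has distance 0, so it sorts first.
(my/rank-by-levenshtein "set-member?"
                        '("set-add" "member-set?" "set-member?")
                        2)
```

Sorting the whole index on every query is the simplest thing that could work; for an index of tens of thousands of entries a real implementation might want to cache or prune candidates first.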

Would it be possible to include this functionality inside racket-mode? I'm completely happy to maintain the package even if it is included with racket-mode; I'll defer to your judgement as to how best to package it up.

The output of something like raco search set! is shown below:

("set!" ("srfi/17") "/nix/store/jb9n4mmqzas62ln6vbj2b5w60cl54nn7-racket-7.8/share/doc/racket/srfi/srfi-std/srfi-17.html#set!")
("set!" ("plai/mutator") "/nix/store/jb9n4mmqzas62ln6vbj2b5w60cl54nn7-racket-7.8/share/doc/racket/plai/mutator.html#(form._((lib._plai/mutator..rkt)._set!))")
("set!" ("plai/gc2/mutator") "/nix/store/jb9n4mmqzas62ln6vbj2b5w60cl54nn7-racket-7.8/share/doc/racket/plai/gc2-mutator.html#(form._((lib._plai/gc2/mutator..rkt)._set!))")

If this is agreeable to you, I can knock up a front end in Elisp as well.

Many thanks,

Yilin

EDIT.

Just to be clear: my intention is not to invoke the raco command from racket-mode, but to go through the same protocol that racket-mode currently uses. It's wrapped up in raco for now because that was the quickest way I could get something usable. Ideally I'd like there to be two links: one to the manual and the other to the output of documentation-at-point.

I have not had time to look through racket-mode yet to see how it would slot in.

yilinwei commented 3 years ago

This is the current function I use to invoke the command.

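;; -*- lexical-binding: t; -*-
;; Requires the third-party `dash' and `s' libraries.
(require 'cl-lib)
(require 'dash)
(require 's)
(require 'wid-edit)

(defun racket-search (text)
  "Run `raco search TEXT' and list the results in the *racket-search* buffer."
  (interactive "MSearch Racket docs: ")
  (let* ((buf (with-current-buffer (get-buffer-create "*racket-search*")
                (erase-buffer)
                ;; Let RET and mouse clicks activate the buttons.
                (use-local-map widget-keymap)
                (current-buffer)))
         ;; Process output arrives in arbitrary chunks, so keep any
         ;; incomplete trailing line until the rest of it arrives.
         (pending "")
         (filter
          (lambda (_proc str)
            (let* ((lines (s-split "\n" (concat pending str)))
                   (es (--map (car (read-from-string it))
                              (--remove (s-blank? it) (butlast lines)))))
              (setq pending (car (last lines)))
              ;; Insert into the results buffer, not whichever buffer
              ;; happens to be current when the filter runs.
              (with-current-buffer buf
                (goto-char (point-max))
                (cl-loop
                 for e in es
                 do (pcase e
                      (`(,txt ,mods ,link)
                       (widget-insert txt)
                       (insert "\t")
                       (widget-insert (s-join ", " mods))
                       (insert "\t")
                       (widget-create 'push-button
                                      :notify (lambda (&rest _ignore)
                                                (browse-url
                                                 (s-prepend "file://" link)))
                                      "Documentation")
                       (newline)))))))))
    ;; Passing :filter to `make-process' is enough; a separate
    ;; `set-process-filter' call is not needed.
    (make-process
     :name "raco"
     :buffer buf
     :sentinel #'ignore
     :command (list "raco" "search" text)
     :filter filter)
    (switch-to-buffer buf)))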

greghendershott commented 3 years ago

I notice you're parsing the plt-index.js file as javascript.

Although there's not necessarily anything wrong with that, I think you could get the data more directly:

#lang racket/base

(require setup/xref
         scribble/xref)

(define xref (load-collections-xref))

(define index (xref-index xref)) ;(listof entry?)

(length index) ; ~= 30,000, similar to plt-index.js

(define (entry->entry+path+anchor e)
  (define-values (path anchor) (xref-tag->path+anchor xref (entry-tag e)))
  (list (entry-words e)
        ;;(entry-content e)
        ;;(entry-tag e)
        (entry-desc e)
        path
        anchor))

;; A couple examples

(entry->entry+path+anchor (list-ref index 10))
;; (list
;;  '("random-source-randomize!")
;;  (procedure-index-desc 'random-source-randomize! '(srfi/27))
;;  #<path:/home/greg/racket/7.8-cs/doc/srfi/srfi-std/srfi-27.html>
;;  "random-source-randomize!")

(entry->entry+path+anchor (list-ref index 1000))
;; (list
;;  '("define-higher-order-primitive")
;;  (form-index-desc 'define-higher-order-primitive '(lang/prim))
;;  #<path:/home/greg/racket/7.8-cs/doc/htdp/index.html>
;;  "(form._((lib._lang/prim..rkt)._define-higher-order-primitive))")

Disclaimers:

I'm not sure that's 100% correct or complete. For example there are two plt-index.js files, for installation and user scopes. I didn't verify the above code covers both.
Although I'd guess that's faster than reading/parsing plt-index.js, I haven't confirmed.

Also, the Racket Mode back end is already doing a load-collections-xref for other purposes. That is fairly slow to do (e.g. seconds), the first time, but then it is cached. So reusing that would probably be good.

yilinwei commented 3 years ago

Thank you!

I'll certainly investigate further - I was not aware of the xref definitions (I walked backwards from the scribble HTML pages) so that was very useful. I will implement a basic search using this approach and update the issue once I've finished.

greghendershott commented 3 years ago

So, I made a quick comment about how to implement this.

But I should back up and ask: What is "this"? I'm not sure what you're proposing to implement. How would this look and work, for the end user? Could you say a little more about this -- what you have in mind?

I couldn't find the old Slack discussion you mentioned. Maybe it has already scrolled past the 10K-message limit of the free plan there.

My guess at what you have in mind:

Racket Mode already has a racket-documentation-search command. It prompts you for a term, and goes to the Racket "Search Manuals" page in an external web browser. This was added a couple months ago in #462.
Instead (or in addition?), you would like to make an Emacs buffer where the search results and links are displayed. Correct?
If the user selects an item ("clicks a link"), what happens? Does it then go to the external web browser, or, does it use eww to view a web page in Emacs, or something else...?

yilinwei commented 3 years ago

Apologies. Let me explain my workflow in depth, along with how I use it currently and how I envisage it being used.

I’ve found that when working with new libraries in Racket I’ll often forget the name of certain functions, forms or arguments, but have a vague idea of what it’s called or should be called: is it set-member? or member-set?? I usually solve this either by getting auto-complete to complete the identifier and then running racket-describe to verify the form or arguments, or by using racket-documentation-search and then tabbing through the alternatives.

Neither approach is ideal: I would rather not leave the editor at all. That’s not to say I want to replicate the manual inside of Emacs; this simple case is just common enough for me to want to optimise it.

The workflow I envisage is that the user enters a search term (similar to how racket-xp-documentation works) and a new buffer pops up with the same information that is displayed in the manual search bar.

Each of the entries ought to have a link to the associated output for racket-describe (or maybe the whole line is the link). I can then navigate to the entry that I care about and check the documentation and source without leaving Emacs. If the documentation is incomplete or I want the surrounding context, I can then go through to the manual from the *Racket describe* page.

Later I was hoping to add some more complex search queries, such as searching within particular modules, or configuration to exclude certain modules from the search results (I'm not going to want results from the teachpacks, for example).
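One way such an exclusion setting could look on the Emacs side is sketched below. This is only an illustration: the variable `my/racket-search-excluded-prefixes`, the function names, and the example prefixes are all hypothetical, and entries are assumed to have the `(term modules link)` shape printed by `raco search` above.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'seq)

(defvar my/racket-search-excluded-prefixes '("2htdp/" "teachpack/")
  "Hypothetical option: module-path prefixes to hide from search results.")

(defun my/racket-search-excluded-p (entry)
  "Return non-nil if every module of ENTRY starts with an excluded prefix.
ENTRY is assumed to be a list (TERM MODULES LINK)."
  (let ((modules (nth 1 entry)))
    (and modules
         (seq-every-p (lambda (module)
                        (seq-some (lambda (prefix)
                                    (string-prefix-p prefix module))
                                  my/racket-search-excluded-prefixes))
                      modules))))

(defun my/racket-search-filter (entries)
  "Return ENTRIES without those provided only by excluded modules."
  (seq-remove #'my/racket-search-excluded-p entries))
```

An entry is dropped only when all of its modules are excluded, so a binding that is also exported by a non-excluded module still shows up.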

Does this make the proposed change clearer?

greghendershott commented 3 years ago

Ah I see. That makes a lot of sense. Good idea!

At this point, if I were implementing this, I might think about some possibilities. You don't need to. I would... but I might also over-think it and take too long. :smile: Having said that, here's my quick list:

For using the doc index in Racket code, I suggested using scribble/xref would be more direct than reading/parsing the plt-index.js files.
For using the doc index in Emacs, I wonder if something like your original approach might make sense after all. Not sure. Just wondering. Because...
Emacs has various "front ends" and approaches for searching. (For example, although I know about helm and ivy, I'm still happy using the built-in ido -- but I enhance with a couple packages like the ido-flex package. That allows "fuzzy matching" in searching.) Users might be accustomed to these and have various preferences. So I wonder if searching is something you maybe shouldn't try to do in the Racket back end, but instead defer to however the user likes it done in Emacs? Not sure. Just pointing it out.
Having said that, I don't know if any of those ready-made Emacs search solutions are suitable for lists on the order of 30,000 items!

Like I said, if you feel this isn't helpful, please ignore!

To be clear: I like your feature idea. I'm definitely not trying to discourage you from doing it, at all! There are various ways to implement it; even if some ways might be better, any is better than nothing at all.

yilinwei commented 3 years ago

Hi,

Thanks for the reply — your thoughts were useful. I realised that I was biasing some of my solutions due to the fact I’ve been working with some LSP servers recently, and the UI design of the scribble manual.

I’ve already switched to using scribble/xref and added the corresponding command in the Racket server.

I think I like your idea of reusing the Emacs completing-read front ends. Apart from the reasons you’ve mentioned:

I don’t believe Emacs will struggle with the number of elements — I have ~25,000 symbols using a completing-read for Emacs functions and it doesn’t bat an eyelid.
The index is largely static which means there’s no real reason not to have it inside of Emacs.
It means greater configurability for Emacs users through a well-defined interface.
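A minimal sketch of that idea follows, assuming the index has already been loaded into Emacs as a list of `(term modules link)` entries like the `raco search` output earlier in the thread. The function names are hypothetical and this is not taken from the actual implementation; it only shows that `completing-read` accepts a hash table as its collection, so whichever completion front end the user has configured does the matching.

```elisp
;;; -*- lexical-binding: t; -*-

(defun my/racket-index-table (entries)
  "Build a hash table mapping a display string to a link for each of ENTRIES.
Each entry is assumed to be a list (TERM MODULES LINK)."
  (let ((table (make-hash-table :test #'equal)))
    (dolist (entry entries table)
      (pcase-let ((`(,term ,modules ,link) entry))
        ;; Include the modules in the key so that identically named
        ;; bindings from different libraries remain distinct candidates.
        (puthash (format "%s  (%s)" term (mapconcat #'identity modules ", "))
                 link
                 table)))))

(defun my/racket-index-read (table)
  "Prompt for a candidate from TABLE and return its link."
  (gethash (completing-read "Racket search: " table nil t) table))
```

Because the table is built once and only read afterwards, it fits the observation that the index is largely static.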

I will code it up and dog-food it for a few days and update the thread.

yilinwei commented 3 years ago

Hi,

Having used it for a few days it works quite well. There are several minor annoyances which changed the implementation somewhat.

I decided not to go through the command server:

The resultant index is big enough that actually parsing the s-expr takes a significant amount of time. I have a fairly good laptop and it still took ~10 seconds or so.
I decided to start pre-processing the entries as they came into the buffer — adding streaming into the cmd server protocol seemed overkill.

The entry points are now racket-search-index, which creates the index, and racket-search, which uses completing-read to search through it.
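The thread does not show the indexing code itself, so the following is only a sketch of the general technique of pre-processing entries as they arrive. It assumes the Racket process prints one s-expression per line; because process output reaches a filter in arbitrary chunks, the sketch holds back the trailing partial line until the rest arrives. All names here are hypothetical.

```elisp
;;; -*- lexical-binding: t; -*-

(defvar my/racket-index (make-hash-table :test #'equal)
  "Hypothetical index: maps a term to the list of entries seen for it.")

(defvar my/racket-index--pending ""
  "Text of an incomplete trailing line left over from the previous chunk.")

(defun my/racket-index-feed (chunk)
  "Parse the complete lines in CHUNK and add their entries to the index.
Each line is assumed to hold one s-expression (TERM MODULES LINK)."
  (let* ((lines (split-string (concat my/racket-index--pending chunk) "\n"))
         (complete (butlast lines)))
    ;; The last element is the empty string when CHUNK ended in a newline;
    ;; otherwise it is a partial line to be completed by the next chunk.
    (setq my/racket-index--pending (car (last lines)))
    (dolist (line complete)
      ;; Skip blank lines, which `read-from-string' cannot parse.
      (unless (string-match-p "\\`[[:space:]]*\\'" line)
        (let ((entry (car (read-from-string line))))
          (push entry (gethash (car entry) my/racket-index)))))))

(defun my/racket-index-filter (_process chunk)
  "Process filter that feeds CHUNK into the index."
  (my/racket-index-feed chunk))
```

A function like `my/racket-index-filter` could be passed as the `:filter` argument of `make-process`, so each entry is parsed as soon as its line is complete rather than after the whole index has been printed.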

There are some outstanding questions:

I don't really know whether we ought to start the indexing process by default.
I haven't yet implemented the visit-thunk.
I haven't tried it with ido or helm.
I don't know whether it should have a default shortcut.

greghendershott commented 3 years ago

Hi! It looks like you've been working a lot on this. I haven't yet caught up or replied. I got busy the last 2 weeks with the Thanksgiving holiday plus needing to handle some other things.

I'll try to catch up within the next few days. Maybe I'll have some feedback from trying it.

Meanwhile a few quick comments/answers:

> I decided not to go through the command server:
>
> The resultant index is big enough that actually parsing the s-expr takes a significant amount of time. I have a fairly good laptop and it still took ~10 seconds or so.
>
> I decided to start pre-processing the entries as they came into the buffer; adding streaming into the cmd server protocol seemed overkill.

Sounds good.

> I don't really know whether we ought to start the indexing process by default.

Good question. I use projectile. Whatever its default for this is, seems to work well for me. Maybe worth looking at that for ideas?

> I haven't yet implemented the visit-thunk.

That should be easy, I can help if you need.

> I haven't tried it with ido or helm

I honestly don't know if they'll just work automatically (e.g. they replace or advise completing-read) or if we (or end users) would need to add a little configuration glue.

> I don't know whether it should have a default shortcut

I think that would be good.
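On the earlier question of configuration glue for ido or helm: `completing-read` dispatches through the standard variable `completing-read-function`, so a front end that sets that variable takes over without any changes here. For front ends that do not, one possible kind of glue is sketched below. The option and function names are hypothetical, and whether a particular package needs this at all is not something the thread settles.

```elisp
;;; -*- lexical-binding: t; -*-

(defvar my/racket-search-completing-read-function nil
  "Hypothetical option: function used to read a search term.
When nil, the current value of `completing-read-function' is used.")

(defun my/racket-search-read-term (terms)
  "Prompt for one of TERMS, a list of strings, and return the choice."
  ;; `completing-read-function' is a special variable, so this `let'
  ;; rebinds it only for the duration of the prompt.
  (let ((completing-read-function
         (or my/racket-search-completing-read-function
             completing-read-function)))
    (completing-read "Racket search: " terms nil t)))

;; Example glue a user might add (ido needs a plain list of strings):
;; (setq my/racket-search-completing-read-function #'ido-completing-read)
```

Keeping the override in a separate option leaves the user's global completion setup untouched.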