Open greghendershott opened 9 years ago
mmm-add-classes also lets you specify functions, instead of regexps, for the front/back parameters.
As you mentioned, the block reader needs to be aware of the language context inside the at-exp to properly handle terminators hiding in comment blocks. The elisp function parse-partial-sexp handles comments and looks like it could work:
"The syntax table controls the interpretation of characters, so these functions can be used for Lisp expressions when in Lisp mode and for C expressions when in C mode. " (ftp://ftp.gnu.org/old-gnu/Manuals/elisp-manual-20-2.5/html_node/elisp_566.html)
At-exp could possibly be extended to include optional meta info describing the block contents. @C:foo{ ... }
Thanks for thinking about this more and following up.
> mmm-add-classes also lets you specify functions, instead of regexps, for the front/back parameters.
I noticed that, I just doubt an independent back matcher function could work reliably. I think it would need to be the "handler" option they mention, that parses the whole thing, because of languages that use }.
> As you mentioned, the block reader needs to be aware of the language context inside the at-exp to properly handle terminators hiding in comment blocks. The elisp function parse-partial-sexp handles comments and looks like it could work:
> "The syntax table controls the interpretation of characters, so these functions can be used for Lisp expressions when in Lisp mode and for C expressions when in C mode. " (ftp://ftp.gnu.org/old-gnu/Manuals/elisp-manual-20-2.5/html_node/elisp_566.html)
Yes -- good point! Once we know the major mode, we can use its syntax table to ignore { and } chars within both comments and strings. Good.
Of course, that requires knowing which major mode...
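A minimal sketch of the syntax-table idea, assuming the sub-mode is already known to be C. The helper name my/in-comment-or-string-p is hypothetical, and it assumes (require 'cc-mode) is enough to define c-mode-syntax-table. It only asks the parser state whether a position sits inside a comment or a string:

```elisp
(require 'cc-mode) ; assumption: this defines `c-mode-syntax-table'

(defun my/in-comment-or-string-p (pos syntax-table)
  "Return non-nil if POS is inside a comment or string per SYNTAX-TABLE.
Hypothetical helper: parsing starts at `point-min', which is assumed
to be outside any comment or string."
  (save-excursion
    (with-syntax-table syntax-table
      (let ((state (parse-partial-sexp (point-min) pos)))
        ;; Element 3 is non-nil inside a string, element 4 inside a comment.
        (or (nth 3 state) (nth 4 state))))))
```

A brace for which this returns non-nil is one the block reader should not treat as a terminator.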
> At-exp could possibly be extended to include optional meta info describing the block contents. @C:foo{ ... }
Yes, something like that is the only idea I have right now.
It would probably be more helpful for the "language tag" to be the name of the Emacs mode, e.g. c-mode instead of C. That might make it simpler for us to make one mmm "class" extension that can handle all modes, since the mode name is right there.
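To make the "mode name is right there" point concrete, here is a rough sketch. The tag syntax @c-mode:foo{ is purely hypothetical (at-exp has no such feature today), and my/at-exp-tag->mode is a made-up name. It turns the text of an opening delimiter into a major mode symbol, or nil when there is no usable tag:

```elisp
(defun my/at-exp-tag->mode (front-string)
  "Return the major mode named by a tag in FRONT-STRING, or nil.
FRONT-STRING is assumed to look like \"@c-mode:foo{\", where the part
between `@' and `:' is a hypothetical language tag."
  (when (string-match "\\`@\\([^:{ \t\n]+\\):" front-string)
    ;; `intern-soft' avoids creating symbols for unknown names.
    (let ((mode (intern-soft (match-string 1 front-string))))
      (and mode (fboundp mode) mode))))
```

I believe mmm-mode class definitions accept a :match-submode function that is called with the front delimiter string, which would be the natural place to plug this in, but treat that as an assumption to check against the mmm-mode manual.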
> I noticed that, I just doubt an independent back matcher function could work reliably. I think it would need to be the "handler" option they mention, that parses the whole thing, because of languages that use }.
Hmm, it seems to work OK. It handles the nested if/else block (marked 1 in the image) and the commented groups of } } } (marked 2), but something odd is happening at the end of the "set-wrap" block near the top of the image.
http://i.imgur.com/0lAQY2L.png
Keep in mind this is tested on the overloaded racket reader that converts top-level { } blocks to @begin/text{... \n}
;; .emacs: mmm-mode
(require 'mmm-mode)
(setq mmm-global-mode 'maybe)
;; uninteresting bit
(defun with-mode (new-mode f)
  (let ((reset
         (buffer-local-value 'major-mode (current-buffer))))
    (funcall new-mode)
    (funcall f)
    (funcall reset)))
;; hack to import c-mode-syntax-table - what is the right way?
(with-mode 'c-mode (lambda ()))
;; relevant part
(defun jmp:inner->out (stx-table limit)
  (setq parse-sexp-ignore-comments nil) ; needed?
  (with-syntax-table stx-table
    (parse-partial-sexp (point) limit -1
                        nil nil nil)))
(mmm-add-classes
 '((cspl15
    :submode c-mode
    :face mmm-declaration-submode-face
    :front "{"
    :back (lambda (limit)
            (jmp:inner->out c-mode-syntax-table limit)
            ;; set match data
            (looking-at "")))))
(mmm-add-mode-ext-class 'racket-mode ".\\.cc\\.rkt" 'cspl15)
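A possible variation on the back matcher above, offered as a sketch rather than a fix. It assumes (require 'cc-mode) is enough to define c-mode-syntax-table (so the with-mode hack may not be needed), and it assumes the matcher should behave roughly like re-search-forward: set the match data to the closing brace itself and leave point after it. The name my/c-block-back is hypothetical:

```elisp
(require 'cc-mode) ; assumption: defines `c-mode-syntax-table' without a C buffer

(defun my/c-block-back (limit)
  "Find the `}' that closes the C block point is currently inside.
Point is assumed to be just after the opening `{'.  On success, set
the match data to that brace, leave point after it, and return point.
Return nil if no such brace is found before LIMIT."
  (let ((state (with-syntax-table c-mode-syntax-table
                 ;; Stop when nesting depth drops to -1, that is, at the
                 ;; first unmatched closer.  Braces inside comments and
                 ;; strings do not change the depth.
                 (parse-partial-sexp (point) limit -1))))
    (when (and (< (car state) 0)
               (eq (char-before) ?}))
      (set-match-data (list (1- (point)) (point)))
      (point))))
```

With this, the class definition could use :back my/c-block-back in place of the lambda. Whether matching the brace itself, instead of the empty string after it, explains the oddity in the screenshot is only a guess.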
Now that racket-hash-lang-mode is merged: In that case we defer to the language for indent.
Now another approach is possible. In something like:
#lang scribble/manual
@codeblock{
  #lang rhombus
  fun fib (n):
    cond
    | n == 0: 1
    | n == 1: 0
    | ~else: fib(n-1) + fib(n-2)
}
You could imagine that the drracket:indentation for scribble/manual could defer to that for rhombus within the codeblock.
Currently that doesn't happen. This is just a hand-wavy idea for a possible direction.
(And if that could/did work, you could also imagine the at-exp meta language could do similar. Maybe that's more complicated because meta language.)
Question from nha_ on #racket: How to avoid racket-mode indentation messing up at-expressions where the content isn't Lisp code or plain text? For example, C code.
Below are some quick notes. At the moment this is more like a wiki entry and brain dump than an actionable issue.
It seems the best way to deal with this is to use mmm-mode. It allows a buffer to use other major modes for certain regions. (There are other Emacs modes that offer to do this, but mmm-mode is currently maintained and on MELPA.)
For example in this file, first M-x mmm-mode. Then select the region inside the curly brackets of the at-expression itself:
Then C-c % C-r, type c-mode and RET.
Now that region will be managed by c-mode for indentation, font-lock, and so on.
Is it tedious to mark these regions manually, every time you edit a buffer? Of course. mmm-mode supports defining classes that can look for regions in a buffer that should use another mode, and do this automatically. It provides some predefined classes, such as for JS within HTML. Of course it provides no predefined class for "c-mode within at-expressions".
How to make one? The "easy" class definitions use a pair of regexps for the begin and end tags. I don't think that will work for matching at-expressions. Example why: The closing regexp couldn't be }, because that could be a brace in the C code, not the one closing the at-expression. Instead, I think such a class would need to use the handler option that takes full control of the search. Such a handler could probably use an Emacs regexp like "@[^ {]+{\\(.\\|\n\\)+}". Because the middle portion, \\(.\\|\n\\)+, is greedy, it matches through all { and } pairs within the C code itself, up to but not including the } closing the at-expr. However it's fragile and would break if a curly brace were within a C comment, for example.
Maybe racket-mode could provide a make-mmm-mode-at-expression-class function that defines a mmm-mode class for a specific major mode inside an at-expression.
Great, but which major mode to be used? At-expressions aren't "tagged" with some ID about the contents.
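Going back to the greedy regexp for a moment, here is a small sketch of how it behaves on plain strings. The helper name my/greedy-at-exp-match is made up for illustration. The point it shows is that the greedy middle runs to the last } in the text it is given, so the search region has to be bounded somehow:

```elisp
(defconst my/at-exp-greedy-regexp "@[^ {]+{\\(.\\|\n\\)+}"
  "The greedy at-expression regexp discussed above.")

(defun my/greedy-at-exp-match (text)
  "Return the part of TEXT matched by `my/at-exp-greedy-regexp', or nil."
  (when (string-match my/at-exp-greedy-regexp text)
    (match-string 0 text)))

;; One at-expression: nested braces are swallowed as intended.
(my/greedy-at-exp-match "x @c{ if (a) { b; } } y")

;; Two at-expressions: the match spans both of them.
(my/greedy-at-exp-match "@c{ a; } and @c{ b; }")
```

The second call matches the whole string, which is one more way the regexp approach is fragile, in addition to braces in comments.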
At best, maybe some file-local variable could say which mode to use for at-expressions. That would be OK for the case where it's the same mode for the entire file. But mixing more than one sub-mode in the same file... I don't know.
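A sketch of what the file-local variable idea might look like. Everything here is hypothetical: racket-mode defines no such variable, and the names my/at-exp-submode and my/at-exp-effective-submode are made up for illustration:

```elisp
(defvar-local my/at-exp-submode nil
  "Hypothetical: major mode symbol to use inside at-expression bodies.
Meant to be set as a file-local variable; nil means no preference.")

;; Only accept symbols from file-local variable specs.
(put 'my/at-exp-submode 'safe-local-variable #'symbolp)

(defun my/at-exp-effective-submode (default)
  "Return the at-expression sub-mode for the current buffer.
Fall back to DEFAULT when `my/at-exp-submode' is unset or does not
name a defined command."
  (if (and my/at-exp-submode (fboundp my/at-exp-submode))
      my/at-exp-submode
    default))
```

A file could then opt in with a first line such as ;; -*- my/at-exp-submode: c-mode -*-. This covers only the one-mode-per-file case.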
And that's where I'm leaving this for now.