niklasfasching / go-org

Org mode parser with html & pretty printed org rendering. also shitty static site generator.
https://niklasfasching.github.io/go-org/
MIT License
353 stars 49 forks

Support for conf-mode #71

Open braoult opened 2 years ago

braoult commented 2 years ago

Emacs conf-mode is used for files like Windows INI, gitconfig, etc. In Org, the block language is "conf", for example:

#+BEGIN_SRC conf
[libdefaults]
    default_realm = LAN
[realms]
    LAN = {
        kdc = kdc1.lan
        kdc = kdc2.lan
        admin_server = kadmin.lan
    }
#+END_SRC

The display in Emacs can look like this: [screenshot: Screenshot_2022-02-20_20-32-38]

org's HTML export is similar. Source code looks like :

<pre class="src src-conf">[<span style="color: #6ae4b9;">libdefaults</span>]
        <span style="color: #00d3d0;">default_realm</span> = LAN
[<span style="color: #6ae4b9;">realms</span>]
        <span style="color: #00d3d0;">LAN</span> = {
                <span style="color: #00d3d0;">kdc</span> = kdc1.lan
                <span style="color: #00d3d0;">kdc</span> = kdc2.lan
                <span style="color: #00d3d0;">admin_server</span> = kadmin.lan
        }
</pre>

I believe the corresponding chroma language could be "Docker" (tested on https://swapoff.org/chroma/playground/). I am not sure what chroma means by INI files, but it is something different. Output of the same code on chroma's test page: [screenshot: Screenshot_2022-02-20_20-40-56]

Would it be possible to include conf blocks in go-org ?

niklasfasching commented 2 years ago

Not sure I understand the request correctly. IIUC you want support for conf highlighting in code blocks. If that's right, you'd have to talk to chroma / whatever highlighter you use with go-org as highlighting logic is not part of this repo

braoult commented 2 years ago

Not sure how it works here, but I know: 1) Conf files are supported in org-mode (with #+BEGIN_SRC conf blocks). 2) Highlighting of conf files is supported in chroma (output looks correct with the "Docker" language, see the screenshots in my initial question, even if it is more of a JSON format).

I thought that go-org uses chroma to highlight SRC blocks, maybe something like: 1) go-org understands that a SRC block is for language X, from the language keyword following BEGIN_SRC. 2) It calls some chroma code for language X, with the contents of the SRC block as input.

EDIT (while double checking) : I think I am wrong ; Emacs (and org-mode) conf-mode is much more complicated than I thought ; it tries to auto-detect the (conf file) type, and supports many sub-types. conf-mode help says :

Mode for Unix and Windows Conf files and Java properties. Most conf files know only three kinds of constructs: parameter assignments optionally grouped into sections and comments. Yet there is a great range of variation in the exact syntax of conf files. See below for various wrapper commands that set up the details for some of the most widespread variants.

This mode sets up font locking, outline, imenu and it provides alignment support through ‘conf-align-assignments’. If strings come out wrong, try ‘conf-quote-normal’.

Some files allow continuation lines, either with a backslash at the end of line, or by indenting the next line (further). These constructs cannot currently be recognized.

Because of this great variety of nuances, which are often not even clearly specified, please don’t expect it to get every file quite right. Patches that clearly identify some special case, without breaking the general ones, are welcome.

If instead you start this mode with the generic ‘conf-mode’ command, it will parse the buffer. It will generally well identify the first four cases listed below. If the buffer doesn’t have enough contents to decide, this is identical to ‘conf-windows-mode’ on Windows, elsewhere to ‘conf-unix-mode’. See also ‘conf-space-mode’, ‘conf-colon-mode’, ‘conf-javaprop-mode’, ‘conf-ppd-mode’ and ‘conf-xdefaults-mode’.

chroma's "Docker" only matches one of them, so it won't work out of the box whenever an org file has a conf src block. It means go-org would have to either try to detect the specific conf format, or accept some parameter that selects one specific chroma renderer.
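To make the "detect the specific conf format" option concrete, here is a minimal sketch of a content-based guess. It is a crude heuristic written for illustration, not Emacs's detection algorithm; the helper name guessConfFlavor and the labels it returns ("equals", "colon", "space") are assumptions and do not correspond to any go-org or chroma API.

```go
package main

import (
	"fmt"
	"strings"
)

// guessConfFlavor is a hypothetical helper, not part of go-org or chroma.
// It counts which separator most assignment-like lines use and returns an
// illustrative label: "equals", "colon", "space" or "unknown".
func guessConfFlavor(content string) string {
	counts := map[string]int{}
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		// Skip blanks, comments, section headers and closing braces.
		if line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, ";") ||
			strings.HasPrefix(line, "[") || line == "}" {
			continue
		}
		switch {
		case strings.Contains(line, "="):
			counts["equals"]++
		case strings.Contains(line, ":"):
			counts["colon"]++
		case len(strings.Fields(line)) >= 2:
			counts["space"]++
		}
	}
	// Check labels in a fixed order so ties resolve deterministically.
	best, bestCount := "unknown", 0
	for _, label := range []string{"equals", "colon", "space"} {
		if counts[label] > bestCount {
			best, bestCount = label, counts[label]
		}
	}
	return best
}

func main() {
	krb := "[libdefaults]\n    default_realm = LAN\n[realms]\n    LAN = {\n        kdc = kdc1.lan\n    }"
	plain := "foo         bar\nline        2\nthird       line"
	fmt.Println(guessConfFlavor(krb))
	fmt.Println(guessConfFlavor(plain))
}
```

Even this toy version shows why the mapping depends on the block contents rather than on the language name alone: two blocks both tagged conf come out with different labels.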

I am not sure what I write here makes any sense at all ; sorry if I totally misunderstood how go-org works.

niklasfasching commented 2 years ago

Thx for elaborating. As you said, go-org doesn't do any highlighting, it just calls chroma and passes the lang defined after BEGIN_SRC as well as the content https://github.com/niklasfasching/go-org/blob/master/org/html_writer.go#L22

I'm not sure there's anything to do inside this library - it's up to the highlighter. Either specify the language as docker or check with chroma if conf could be an alias for that / why conf is not highlighted as expected.

Providing a key to override the specified language would be possible, but unless babel has support for that I'm not sure we should really add it.

braoult commented 2 years ago

In fact, babel has support for conf-mode, but, as there is a type auto-detect, conf-mode is able to handle many different "formats". Examples:

The 2 blocks above show in emacs/org-mode as: [screenshot: Screenshot_2022-02-21_22-03-41]

And the org HTML export looks like :

[screenshot: Screenshot_2022-02-21_22-06-46]

Here, the second block is simply exported as <pre> block, while the first one is correctly highlighted. The org exported HTML is :

<div class="org-src-container">
<pre class="src src-conf">[<span style="color: #6ae4b9;">libdefaults</span>]
        <span style="color: #00d3d0;">default_realm</span> = LAN
[<span style="color: #6ae4b9;">realms</span>]
        <span style="color: #00d3d0;">LAN</span> = {
                <span style="color: #00d3d0;">kdc</span> = kdc1.lan
                <span style="color: #00d3d0;">kdc</span> = kdc2.lan
                <span style="color: #00d3d0;">admin_server</span> = kadmin.lan
        }
</pre>
</div>
<div class="org-src-container">
<pre class="src src-conf">foo         bar
line        2
third       line
</pre>
</div>

The difficulty is that a "conf" block in org could well correspond to multiple chroma "languages". In my 2 examples, only the first block is rendered "correctly" (even if technically speaking it should not) by chroma's "Docker": [screenshot: Screenshot_2022-02-21_22-17-07]

So, unless go-org also tries to "auto-detect" the different possible org-mode "conf blocks", I don't see how we could choose a suitable chroma renderer.

niklasfasching commented 2 years ago

Makes sense. In any case, I'd bring it up with chroma :). go-org isn't the right place to auto detect this imho

braoult commented 2 years ago

Whatever, as you said that go-org passes the src block language as-is to chroma, I will simply use the final chroma languages suitable for my different conf blocks, until a better auto-detection system works. This workaround will be good enough for my own needs ;-)

braoult commented 2 years ago

I understand (and agree) this question may be closed. However, we must understand this is a kind of fork from org-mode: some source blocks (here conf blocks) are incompatible between go-org/chroma and org-mode. It means some org src blocks written for go-org will not be understood by org-mode, and vice versa.

I think we should keep a list of those src blocks incompatibilities somewhere, that we could extend when we notice some. Does it make sense ?

niklasfasching commented 2 years ago

I'd disagree with keeping such a list in this repo - what languages org babel supports is configurable anyways (e.g. org-babel-load-languages) -there's no fixed target. Even if there was a fixed list of languages for org babel, this library does not handle syntax highlighting - it merely provides a hook for it.

braoult commented 2 years ago

Well, the org-babel conf src block is available in bare Emacs. If you use org-edit-special (Emacs 27.1 and later) on a conf src block, the correct edit mode is used depending on the detected config syntax, as discussed in this thread.

When I said "keeping a list", I meant "keeping in a single place the knowledge of what can be done in some situations, when go-org/chroma does not render code as done in org-mode".

If we keep such a list, it could also help in the future (for chroma or go-org, or somewhere in between?) if some people try to make things easier to use for casual users. I don't think this kind of list could appear in the org-mode/babel documentation (they don't know about chroma), nor in chroma's (they don't know about Emacs/org-mode).

It could be something like :


There are some differences between org-mode/babel SRC block rendering/exporting and go-org/chroma rendering. To get better rendering in go-org for some babel languages, you may try to replace the org-mode block language as follows:

org-mode SRC block language                 go-org

conf [Conf|Unix]                            Docker
...

Then, we would just have to add a line every time someone notices a difference.

But up to you, of course.

niklasfasching commented 2 years ago

I'm still not convinced this repo is the right place for such a list. As said, this is a difference between emacs and chroma. Keeping a list of differences between those two feels out of scope when go-org doesn't depend on chroma but rather uses it as an example implementation.

I'd be up for keeping this open and waiting for more use cases. If this is a common thing to run into, let's add it even if out of scope.

kaushalmodi commented 2 years ago

@niklasfasching Can go-org have its own config object that hugo passes on by parsing the user's hugo site config? [I know this will be a lot of work, but just putting my thought out here.]

I faced this problem with emacs major modes doing syntax highlighting, but Chroma not working with those exact LANG identifiers. So I came up with this for ox-hugo which works really well. ox-hugo simply translates the Emacs major mode name to something that Chroma understands.

I understand that it doesn't make sense to bake this translation table into go-org. But having a [go-org] namespace for doing these settings in config.toml would be useful.

braoult commented 2 years ago

@kaushalmodi, I am not convinced we should use a 3rd dependency: go-org sits between Emacs/org-mode and chroma. Do you mean we would need some hugo user config file to find out the correct language matching?

Disclaimer: I don't know or use ox-hugo. All I know is that org-mode uses some block language names that cannot map one-to-one onto a list of different languages: org-mode does extra work on the block language (which is a generic name, like conf) to guess the precise syntax among the many handled by that src block language.

kaushalmodi commented 2 years ago

Do you mean we would need some hugo user config file to find-out a correct language matching

I meant that this library will need a map-like config object at input. For a Hugo user, that translation map will be in the Hugo config. On the Hugo side, the config will be converted to this object and passed on to go-org.

The map can be something like:

# toml
[go_org]

  [[go_org.src_lang_mapping]]
    org = "conf"
    chroma = "cfg"

  [[go_org.src_lang_mapping]]
    org = "ipython"
    chroma = "python"

or

# yaml
go_org:
  src_lang_mapping:
    - org: conf
      chroma: cfg
    - org: ipython
      chroma: python

Above is a rough idea. The exact structure of that object will need more thought.
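As a rough sketch of how such a list could be consumed on the Go side, the entries can be folded into a lookup table with a fallback to the original language. The type SrcLangMapping and the functions buildLookup and translateLang are hypothetical names chosen for this illustration; nothing like them exists in go-org or Hugo as far as this thread shows.

```go
package main

import "fmt"

// SrcLangMapping mirrors one entry of the proposed src_lang_mapping list.
// The type and its field names are hypothetical.
type SrcLangMapping struct {
	Org    string
	Chroma string
}

// buildLookup turns the list of entries into a map for direct translation.
// If the same Org name appears twice, the later entry wins.
func buildLookup(entries []SrcLangMapping) map[string]string {
	lookup := make(map[string]string, len(entries))
	for _, e := range entries {
		lookup[e.Org] = e.Chroma
	}
	return lookup
}

// translateLang returns the mapped language, or lang unchanged when the
// table has no entry for it.
func translateLang(lookup map[string]string, lang string) string {
	if mapped, ok := lookup[lang]; ok {
		return mapped
	}
	return lang
}

func main() {
	lookup := buildLookup([]SrcLangMapping{
		{Org: "conf", Chroma: "cfg"},
		{Org: "ipython", Chroma: "python"},
	})
	fmt.Println(translateLang(lookup, "conf"))
	fmt.Println(translateLang(lookup, "go"))
}
```

Falling back to the unmodified language keeps today's behaviour for every block that has no entry in the table.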

braoult commented 2 years ago

I don't think the configuration file format will be the real issue here (xml, json, yaml, simple key-value, etc.), as long as we can find the mapping itself (how to map a unique org language like conf to different chroma languages).

Thinking again about an acceptable approach, I am just wondering if a kind of source block metadata would not be better : It would keep full org-mode compatibility, and would just be used by go-org/chroma when it exists. Maybe something like :

#+ATTR_CHROMA: :lang docker
#+BEGIN_SRC conf
[libdefaults]
    default_realm = LAN
[realms]
    LAN = {
        kdc = kdc1.lan
        kdc = kdc2.lan
        admin_server = kadmin.lan
    }
#+END_SRC

#+ATTR_CHROMA: :lang ApacheConf
#+BEGIN_SRC conf
# comment
foo         bar
line        2
third       line            
#+END_SRC

I think this would be a relatively "cheap" solution, and not disruptive: no change for current go-org users, and no break in org/babel compatibility, unlike what I planned to do in https://github.com/niklasfasching/go-org/issues/71#issuecomment-1047669873. The idea would be: go-org passes the source block language to chroma as-is (or a chroma equivalent if we have a mapping table), except if the chroma attribute :lang is set (that value would be used instead).

niklasfasching commented 2 years ago

ATTR sounds good but won't work for inline source blocks like src_conf{foo bar} - we could instead use a param - e.g.

#+begin_src conf :chroma-lang ApacheConf
foo bar
#+end_src

src_conf[:chroma-lang ApacheConf]{foo bar}

braoult commented 2 years ago

ATTR sounds good but won't work for inline source blocks like src_conf{foo bar} - we could instead use a param - e.g.

Yes, it would be much better. I thought about it, but I was unsure about what is possible or not with params (I asked on #emacs on freenode, but did not get an answer, which is why I discussed only the ATTR option).

If there is no drawback, a param would be clearly the best (and simple) option...

kaushalmodi commented 2 years ago

#+ATTR_CHROMA: :lang docker
#+BEGIN_SRC conf

I feel that will create too much chroma-related clutter in the Org source files. If there's an option of a central config, you only need to edit one config file and be done.

braoult commented 2 years ago

#+ATTR_CHROMA: :lang docker
#+BEGIN_SRC conf

I feel that will create too much chroma-related clutter in the Org source files. If there's an option of a central config, you only need to edit one config file and be done.

For me, these are two different things: a mapping for when a given org-mode language has a unique corresponding chroma language (we could use a mapping config file, or even hard-code it, as hugo does), AND the case where a given org-mode language could be different chroma languages (conf -> Docker | INI | ApacheConf ...).

I would like to add that, even with no mapping table, the SRC block param (let's forget about ATTR) allows us to do everything. A mapping table is a "plus" (that I like), but maybe it could be a separate project, as we would first need to find out the mappings themselves, which may not be so easy.

niklasfasching commented 2 years ago

Alright - both the centralized mapping config and the inline language override have value. The inline override is easier to implement and more flexible (the centralized mapping breaks down for e.g. conf mode - the mapping depends on the actual content) so I'd opt for that.

Name suggestions? I don't like chroma-lang too much as chroma is an implementation detail - highlight-lang?

braoult commented 2 years ago

Which name are you talking about? I am not sure I understand what you mean.

kaushalmodi commented 2 years ago

@braoult I think it's the name of the key where we specify the lang name value (see below).

@niklasfasching If you are adding support for inline language override, this might work too:

#+header: :export-lang docker
#+begin_src conf
..
#+end_src

Another name suggestion: export-lang because that's the lang that will be exported vs what's seen for major mode highlighting in the Org src block.

#+header is a generic keyword. I think that regular Org exporters will not use that #+header keyword value for src blocks. That needs to be tested.

braoult commented 2 years ago

hmm... Imagine we have another implementation (I mean not Chroma), with totally different language mapping. How could we specify both of them with this syntax (I mean how could we have the same org file being used whatever implementation we use) ?

What about something like :

#+ATTR_GO_ORG: :lang-chroma docker :lang-implem2 conf_file_1
#+BEGIN_SRC conf
...
#+END_SRC

Or even:

#+ATTR_GO_ORG: chroma:docker implem2:conf_file_1
#+BEGIN_SRC conf
...
#+END_SRC

Edit: Ooops. I totally forgot we said ATTR is not a good solution (for inline source blocks). So maybe we could use some parameters such as :go-org-lang-chroma and :go-org-lang-implem2 ?

tecosaur commented 1 year ago

Heyo, just browsing the issues and I came across this. I think I might be able to help. Org has a bunch of ways of translating #+begin_src LANG to different modes via org-src-lang-modes. This provides a mapping from LANG to the relevant mode, for example by default it has:

(("C" . c)
 ("C++" . c++)
 ("asymptote" . asy)
 ("bash" . sh)
 ("beamer" . latex)
 ("calc" . fundamental)
 ("cpp" . c++)
 ("ditaa" . artist)
 ("desktop" . conf-desktop)
 ("dot" . graphviz-dot)
 ("elisp" . emacs-lisp)
 ("ocaml" . tuareg)
 ("screen" . shell-script)
 ("shell" . sh)
 ("sqlite" . sql)
 ("toml" . conf-toml))

If go-org were to support a similar customisation, just switching out the language as prescribed, I think that would be fairly reasonable.
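A minimal sketch of what "switching out the language as prescribed" could look like in Go follows. The table copies only a few of the entries quoted above, and its values are Emacs mode names; whether a given highlighter accepts those names is an assumption that would need checking. The names defaultLangModes and resolveLang are hypothetical.

```go
package main

import "fmt"

// defaultLangModes copies a few entries from the default list quoted above.
// The values are Emacs mode names; whether a highlighter such as chroma
// accepts them is NOT verified here.
var defaultLangModes = map[string]string{
	"bash":   "sh",
	"cpp":    "c++",
	"elisp":  "emacs-lisp",
	"shell":  "sh",
	"sqlite": "sql",
}

// resolveLang applies user overrides first, then the defaults, and
// otherwise returns lang unchanged. Reading from a nil map is safe in Go,
// so callers without overrides can pass nil.
func resolveLang(lang string, userModes map[string]string) string {
	if mapped, ok := userModes[lang]; ok {
		return mapped
	}
	if mapped, ok := defaultLangModes[lang]; ok {
		return mapped
	}
	return lang
}

func main() {
	user := map[string]string{"conf": "docker"} // hypothetical user customisation
	fmt.Println(resolveLang("elisp", nil))
	fmt.Println(resolveLang("conf", user))
	fmt.Println(resolveLang("conf", nil))
}
```

Checking user entries before the defaults mirrors how the Emacs list can be extended or changed by the user.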

niklasfasching commented 1 year ago

I'm not sure how to go about this issue. A 1:1 mapping as described would not handle the initial request of highlighting a conf src block in the correct language (bc there's no 1:1 mapping and conf could be docker or space or ... - emacs org mode just does more work here that I don't want to implement myself).

Do I understand the consensus right as:

  1. 1:1 mapping
  2. allow override via :export-lang src parameter

?

I prefer 2 just because 1 is tied to the highlighting implementation and left abstract in go-org - but then again no strong feelings (anymore) as chroma is so widespread.

@braoult what would you prefer?

niklasfasching commented 1 year ago

With https://github.com/niklasfasching/go-org/pull/85 we can customize chroma behavior based on src parameters, i.e. both a 1:1 lang mapping (1) and :export-lang or whatever (2) can be implemented in HighlightCodeBlock.

Closing for now as no changes to go-org are required to make this work - feel free to re-open if you want to continue discussing keeping such a mapping in this repo or anything else; but doesn't feel like there's anything to be done for now.

tecosaur commented 1 year ago

Could the default mapping I described in https://github.com/niklasfasching/go-org/issues/71#issuecomment-1300414699 be applied in this repo?

niklasfasching commented 1 year ago

Definitely - as long as we keep it isolated and e.g. just add a OrgChromaMapping var or smth, I'm happy for a PR :)

braoult commented 1 year ago

Do I understand the consensus right as:

  1. 1:1 mapping
  2. allow override via :export-lang src parameter

I prefer 2 just because 1 is tied to the highlighting implementation and left abstract in go-org - but then again no strong feelings (anymore) as chroma is so widespread.

@braoult what would you prefer?

I am so sorry, I totally missed this comment... My preference would be (2) too.

niklasfasching commented 1 year ago

Sounds good. Since #85, (2) should be as simple as

HighlightCodeBlock: func(source, lang string, inline bool, params map[string]string) string {
  if exportLang, ok := params[":export-lang"]; ok {
    lang = exportLang
  }
// [...]

So I'd opt for leaving that up to the user. Thoughts?
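For readers who want to try this, here is a self-contained sketch of such a hook as a plain function with the same parameter list. It is an illustration under assumptions: instead of calling a real highlighter it only escapes the source and records the resolved language in a class attribute, and the function name highlightCodeBlock and the emitted markup are made up for this example.

```go
package main

import (
	"fmt"
	"html"
)

// highlightCodeBlock takes the same parameters as the hook quoted above.
// It is a stand-in: a real hook would hand source and lang to a highlighter.
// Here the source is only HTML-escaped and the resolved language is exposed
// in the class attribute so the effect of the override is visible.
func highlightCodeBlock(source, lang string, inline bool, params map[string]string) string {
	// An explicit, non-empty :export-lang parameter wins over the block language.
	if exportLang, ok := params[":export-lang"]; ok && exportLang != "" {
		lang = exportLang
	}
	class := html.EscapeString(lang)
	escaped := html.EscapeString(source)
	if inline {
		return fmt.Sprintf(`<code class="src src-%s">%s</code>`, class, escaped)
	}
	return fmt.Sprintf(`<pre class="src src-%s">%s</pre>`, class, escaped)
}

func main() {
	params := map[string]string{":export-lang": "ApacheConf"}
	fmt.Println(highlightCodeBlock("foo bar", "conf", false, params))
	fmt.Println(highlightCodeBlock("foo bar", "conf", true, nil))
}
```

Blocks without the parameter keep their original language, so existing documents are unaffected.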

braoult commented 1 year ago

Sounds good. Since #85, (2) should be as simple as

HighlightCodeBlock: func(source, lang string, inline bool, params map[string]string) string {
  if exportLang, ok := params[":export-lang"]; ok {
    lang = exportLang
  }
// [...]

So I'd opt for leaving that up to the user. Thoughts?

Perfect for me, hard-coding mapping in go-org itself would be a never-ending story.

tecosaur commented 1 year ago

Perfect for me, hard-coding mapping in go-org itself would be a never-ending story.

It would if you tried to account for user customisation. I didn't suggest that; I suggested using the mapping that Org itself ships with.

braoult commented 1 year ago

[...] I suggested using the mapping that Org itself ships with.

How would you do this ? The org mapping may be extended/changed at will (and anyway this would not be suitable for #71, as there is no mapping for the different conf-mode formats - see https://github.com/niklasfasching/go-org/issues/71#issuecomment-1047240974).

tecosaur commented 1 year ago

The org mapping may be extended/changed at will

Indeed, some users do extend it, but using the default value is still sensible.

and anyway this would not be suitable for https://github.com/niklasfasching/go-org/issues/71, as there is no mapping for the different conf-mode formats

Sure, it won't handle all the variations, but it will handle some of them. For example, the default org-src-lang-modes has the following entries:

 ("desktop" . conf-desktop)
 ("toml" . conf-toml)

Besides which, I don't see how "this won't handle all use cases" is an argument against something that will improve some use cases. That sounds like the perfect being the enemy of the good :slightly_smiling_face:.

braoult commented 1 year ago

What I meant is that conf is the original Org block language, and the real format (subtype?) is detected by Emacs itself based on the block contents. There is no mapping here. If we create one, I believe it will be a never-ending story (compared to a simple export parameter), and it would mean that we need the mapping on both the Emacs and the go-org side. It would also mean that conf blocks would have to be changed to something else, i.e. we would simply break the org document (I could no longer send my org file to someone else, as their Emacs configuration would not handle the new foo-conf mode). Or did I misunderstand something?

braoult commented 1 year ago

Sounds good. Since #85, (2) should be as simple as


HighlightCodeBlock: func(source, lang string, inline bool, params map[string]string) string {
  if exportLang, ok := params[":export-lang"]; ok {
    lang = exportLang
  }

What about :export-lang-chroma, :export-chroma or something similar (as the value will be a real chroma format) ?