astoff / digestif

A language server for TeX and friends

Extend ConTeXt support for sections #15

Closed flying-sheep closed 4 years ago

flying-sheep commented 4 years ago
astoff commented 4 years ago

Nice, I'll review this shortly.

There's something in this script that's not quite deterministic, so different runs give seemingly equivalent but different outputs (hence 61k additions, 44k deletions). If you could find out why, it would be really great.

make the language server aware of the section hierarchy

Well, for which functionality would you like to use this? As of yet, nothing uses the section level information.

support \start<section>[title={...}] versions

I'm not sure yet this is the ideal way to deal with startstop stuff, but in principle \startsomething is governed by the environments.something entry of the data file.

flying-sheep commented 4 years ago
  1. The script is deterministic. The diff algorithm just isn’t very good, and my change to include all alternatives added a lot of commands:

    $ git diff --stat master -- context.tags
    data/context.tags | 106618 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------------------------------------------
    1 file changed, 62216 insertions(+), 44402 deletions(-)
    $ git diff --stat --diff-algorithm=minimal master -- context.tags
    data/context.tags | 17966 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
    1 file changed, 17890 insertions(+), 76 deletions(-)

    ConTeXt’s “alternatives” are instances of commands with some parameters pre-filled. E.g. all headings are just instances of \section with different levels, different styles, and some with no numbering.

    They could of course be represented more minimally in the data file if you want.

  2. I’m cheating, as I actually get the following error when running the script. Any idea why my addition to the utils file doesn’t work?

    lua: ../scripts/extract-context.lua:162: attempt to call a nil value (field 'has_value')

    Am I using the wrong digestif.utils because I also have digestif installed system wide?

  3. As of yet, nothing uses the section level information.

    We should use the symbol hierarchy to have an outline for navigation.

astoff commented 4 years ago

1. The script is deterministic.

Thanks for checking!

2. Am I using the wrong digestif.utils because I also have digestif installed system wide?

Probably. The order in which modules are loaded in Lua also seems a bit weird to me. In this specific case, I think has_value is specialized enough that it could stay inside the extract script.
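A minimal sketch of what keeping the helper local to the extract script could look like. The body of `has_value` here is an assumption (the thread never shows the original), and the module name `digestif.utils` is taken from the question above. The last line uses the standard `package.searchpath` (Lua 5.2 and later) to show which file, if any, `require` would pick up, which may help decide whether a system-wide install is shadowing the local one.

```lua
-- Hypothetical local helper: true if `value` occurs among the values of `tbl`.
local function has_value(tbl, value)
  for _, v in pairs(tbl) do
    if v == value then
      return true
    end
  end
  return false
end

print(has_value({ "chapter", "section" }, "section"))

-- Prints the path require() would load for this module name,
-- or nil plus the list of locations that were tried.
print(package.searchpath("digestif.utils", package.path))
```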

3. We should use the symbol hierarchy to have an outline for navigation.

Yes, that's probably the next most interesting feature to add.

Are you interested in cross-reference and bibliography support? For this, it would be necessary to add action fields with value "label", "ref", "cite", "bibitem" to the commands that behave like the corresponding LaTeX commands. This is lower-hanging fruit, since we can reuse most of the existing code for LaTeX.
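As a rough illustration of what such action fields might look like in the data file: the sketch below pairs a few ConTeXt commands (such as \pagereference, \in and \cite) with action values. Which command gets which action is an assumption made for illustration, and argument descriptions are left out entirely.

```lua
-- Hypothetical entries: the command names are ConTeXt's, but the pairing
-- of each command with an action value is an assumption, not checked data.
local commands = {
  pagereference = { action = "label" },
  textreference = { action = "label" },
  ["in"]        = { action = "ref" },  -- `in` is a Lua keyword, so the key is quoted
  at            = { action = "ref" },
  about         = { action = "ref" },
  cite          = { action = "cite" },
  nocite        = { action = "cite" },
}

-- List the commands grouped by nothing fancier than their action value.
for name, entry in pairs(commands) do
  print(entry.action, name)
end
```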

flying-sheep commented 4 years ago

Well, I also don’t have much time, as I’d rather spend time on my thesis than on this at the moment, so figuring this out isn’t my first priority, sorry. Maybe some hours at the weekend.

But here’s all I know to make it easier:

labels and references

creating labels

% section headers, \command style
\chapter[preface]{Dear Reader}
% section headers, \startenvironment style
\startsection[reference=methods, title={Methods}]
% direct label creation
\pagereference[reference]
\textreference[reference]{text}

referencing labels

\in{chapter}[preface] % chapter 2.2
\at{page}[preface]    % page 24
\about[preface]       % “Dear Reader”
% after the following, \insec[reference]/\atsec[reference] can be used
\definereferenceformat[insec][text=section]

referencing labels in other documents

See here

bibliographies and citations

linking bibliography datasets

Standard or named

\usebtxdataset[filename.bib]
% or
\usebtxdataset[standard][filename.bib]
% or
\definebtxdataset[name]
\usebtxdataset[name][filename.bib]

Besides a .bib file, a dataset can also be loaded from a buffer, XML, or Lua (see here):

\startbuffer[bufname]
...
\stopbuffer
\usebtxdataset[...][bufname.buffer]

citing

% deprecated
\cite{key}
\nocite{key}

% supported
\cite[scheme][key]
\nocite[key]

\citation[scheme][key]
\nocitation[key] % or \usecitation[key]

% people also do things like
\def\inlinecite{\cite[authoryears]}

astoff commented 4 years ago

Is the stuff concerning sectioning ready? In this case I would like to merge! The other types of commands can be done later (by either of us).

flying-sheep commented 4 years ago

Not yet, I still have to

  1. figure out how to write tests,
  2. parse the \start<sectionkind>[title=...] form, and then
  3. actually do something with the information

Edit: The tests are in Haskell. I can’t (really) code in Haskell and have no idea how to set it up on Arch and install the necessary libs…

astoff commented 4 years ago

So, concerning \startsection and similar forms, the procedure is to add an entry environments.section to the tags file. This entry can be identical to commands.section. Alternatively, "links" can be used, which would look like this:

environments = {
  section = "$DIGESTIFDATA/context/commands/section",
  subsection = "$DIGESTIFDATA/context/commands/subsection"
}

I'd be happy(er) with one PR just to set up the data files, and leave further work that actually does something with the information (including the necessary tests) for future PRs.

Once the data files are there, I personally would work out the cross-references and bibliography part.

PS: I just noticed that the startstop forms for section have different arguments, so linking can't be used in this case.

flying-sheep commented 4 years ago

Okay, then this should be almost ready, as e.g. environments.chapter already has action and section_level set correctly.

One question: I could link section instances to their main command, if it wasn’t for section_level. Any way around that?

flying-sheep commented 4 years ago

Other thoughts:

For the keys to use, check this. I think the fallback sequence for what to use as the LSP DocumentSymbol name should be bookmark → list → title (as PDF bookmarks are used and displayed similarly to DocumentSymbols), and reference can be used for supplying, well, reference targets.

   chapter = {
      action = "section",
      arguments = {
         {
            delimiters = "$DIGESTIFDATA/context/data/brackets",
            keys = {
               bookmark = { meta = "text" },
               list = { meta = "text" },
               marking = { meta = "text" },
               ownnumber = { meta = "text" },
               reference = { meta = "reference" },
               title = { meta = "text" }
            },
            list = true,
            meta = "assignments",
            optional = true
         },
         {
            delimiters = "$DIGESTIFDATA/context/data/brackets",
            keys = { ["cd:key"] = { meta = "value" } },
            list = true,
            meta = "assignments",
            optional = true
         }
      },
      section_level = 2
   },

For tests I came up with this so far, but I think that’s trivial:

syms_or_infs <- getDocumentSymbols doc
liftIO $ case syms_or_infs of
  Right infs -> fail (printf "Expected [DocumentSymbol] not %s" (show infs))
  Left syms -> syms `shouldBe` [...]

astoff commented 4 years ago

One question: I could link section instances to their main command, if it wasn’t for section_level. Any way around that?

No. What could be done is linking e.g. commands.subsection.arguments[1].keys to commands.section.arguments[1].keys. This in fact applies to all commands that inherit key-value arguments from somewhere else without adding anything new. But that's just an optimization.
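To make the linking idea concrete, here is a minimal sketch of how a link string could be followed to the table it points at. It assumes that the segments after `$DIGESTIFDATA/` name a data file and then nested table keys, and that a numeric segment indexes an argument list; the `data` table and the `resolve_link` function are hypothetical and are not Digestif's actual resolver.

```lua
-- Hypothetical in-memory stand-in for the loaded data files.
local data = {
  context = {
    commands = {
      section = {
        arguments = {
          { keys = { title = { meta = "text" }, reference = { meta = "reference" } } },
        },
      },
    },
  },
}

local PREFIX = "$DIGESTIFDATA/"

-- Follow a link string segment by segment; non-link values are returned as-is.
local function resolve_link(link)
  if type(link) ~= "string" or link:sub(1, #PREFIX) ~= PREFIX then
    return link
  end
  local node = data
  for segment in link:sub(#PREFIX + 1):gmatch("[^/]+") do
    if type(node) ~= "table" then
      return nil
    end
    -- Numeric segments index array-like tables (an assumption of this sketch).
    node = node[tonumber(segment) or segment]
  end
  return node
end

-- commands.subsection could then share its keys with commands.section:
local keys = resolve_link("$DIGESTIFDATA/context/commands/section/arguments/1/keys")
print(keys.title.meta)
```

With a resolver of this kind, every command that inherits its key-value arguments unchanged would only need to store the link string rather than a copy of the keys.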

what to use as LSP DocumentSymbol

I thought it would be the reference! But I guess this depends on what the DocumentSymbol interface will look like in the editor. What is your expectation? Do you want to see the document sections in a tree structure?

astoff commented 4 years ago

By the way, you pointed to the wiki, so let me ask: Is there a way we could get a docstring for every command and keyword without hard work like parsing wiki markup or HTML? I couldn't find any.

Another cool thing would be to have a PDF bookmark to the place each command is described in the manual, like we have already for PGF/TikZ. When I checked, the manual had no permanent-looking bookmarks.

flying-sheep commented 4 years ago

What is your expectation? Do you want to see the document sections in a tree structure?

DocumentSymbol is for TOCs: Every editor represents them as a tree of collapsible headers that you can click to jump to a thing. I use them mostly to jump around in my document or code. About the reference:

detail: String | null
More detail for this symbol, e.g the signature of a function.

I think adding it as DocumentSymbol.detail would make a lot of sense!
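A minimal sketch of that mapping, assuming the parsed key-value argument of a section command is available as a plain Lua table: the name falls back from bookmark to list to title, and the reference goes into detail. The function name `symbol_fields` and the shape of its input are hypothetical.

```lua
-- Hypothetical helper: derive the name and detail of a DocumentSymbol
-- from the key-value argument of a section command.
local function symbol_fields(keys)
  return {
    name = keys.bookmark or keys.list or keys.title or "",
    detail = keys.reference,  -- nil when the section has no reference
  }
end

-- \startsection[reference=methods, title={Methods}]
local sym = symbol_fields({ reference = "methods", title = "Methods" })
print(sym.name, sym.detail)
```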


Is there a way we could get a docstring for every command and keyword without hard work like parsing wiki markup or HTML?

Nope, I asked them and they said:

No, there isn’t. It could be in the interface files if someone would put in the work.

astoff commented 4 years ago

A problem I see with the documentSymbol request is that, in the LSP spec, a "document" is a single file. So it doesn't seem that LSP allows showing the outline of an entire book if it's split into several files.

There is also a workspace/symbol request, but this one is for a non-hierarchical list of things, so it's also not ideal for this use-case.

Anyway, the way to proceed here is to come up with the best document-outline API for TeX, and then map it onto the LSP spec in the best possible way, even if that's suboptimal.

flying-sheep commented 4 years ago

I don’t see a problem. Yes, it’s per-file. There’s a clear hierarchy defined for LaTeX and ConTeXt each, which can be parsed out of each file.
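For illustration, a per-file outline could be built from a flat list of section commands in document order, using section_level to decide nesting. This is only a sketch under assumptions: the input shape (a `name` and a positive integer `level` per entry) and the function `build_outline` are hypothetical, and nothing here is existing Digestif code.

```lua
-- Hypothetical: turn a flat, ordered list of sections into a tree.
-- Each entry needs a `name` and a positive `level` (smaller means higher up).
local function build_outline(sections)
  local root = { level = 0, children = {} }
  local stack = { root }  -- chain of currently open ancestors
  for _, sec in ipairs(sections) do
    -- Close every open section at the same or a deeper level.
    while #stack > 1 and stack[#stack].level >= sec.level do
      table.remove(stack)
    end
    local node = { name = sec.name, level = sec.level, children = {} }
    local parent = stack[#stack]
    parent.children[#parent.children + 1] = node
    stack[#stack + 1] = node
  end
  return root.children
end

local outline = build_outline({
  { name = "Dear Reader", level = 2 },
  { name = "Methods", level = 3 },
  { name = "Results", level = 2 },
})
print(#outline, #outline[1].children)
```

The resulting `children` lists correspond directly to the nested children of a DocumentSymbol, so each file yields its own tree without needing knowledge of the other files.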

The workspace/symbol request seems useful as a project-wide “go to header/figure” search.

astoff commented 4 years ago

@flying-sheep Do you think this can be merged as is now, or with just some quick adjustments? I think we should leave the document outline functions for later.

flying-sheep commented 4 years ago

Sure, I mean it doesn’t do much currently, but it certainly adds correct metadata