minoki / cluttex

Process LaTeX documents without cluttering your working directory
GNU General Public License v3.0

makeglossaries with abbreviations/acronyms #14

Open atticus-sullivan opened 1 year ago

atticus-sullivan commented 1 year ago

--makeglossaries doesn't work with extensions other than .glo/.glg/.gls. Therefore, it can't be used for acronyms or other glossaries.

I'd work on this, but not right now. Just wanted to file this "bug/feature".

I'd have to read the glossaries documentation more carefully to see whether there are some common extensions (I think there are). Then maybe we should activate these by default and allow the user to add more extensions on top.

atticus-sullivan commented 1 year ago
Seems that there are some common ones:

| Type     | log  | output | input |
| -------- | ---- | ------ | ----- |
| acronyms | .alg | .acr   | .acn  |
| symbols  | .slg | .sls   | .slo  |
| numbers  | .nlg | .nls   | .nlo  |
| index    | .ilg | .ind   | .idx  |

At least that's what I found in the documentation of the glossaries package (and glossaries-extra, but I didn't find additional ones there).

So maybe it would be worth it to add these extensions by default as well. In addition, we should somehow enable the user to add their own extensions/pairs.
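If we do go with defaults, the table above could be encoded as plain data that user-supplied entries get appended to. A rough sketch (the table and field names are made up for illustration, not existing cluttex code):

```lua
-- Hypothetical defaults derived from the extension table above.
-- The first entry reflects the current .glo/.gls/.glg behavior;
-- user-supplied configurations would be appended to this list.
local default_glossary_types = {
  { type = "glossary", log = ".glg", out = ".gls", inp = ".glo" },
  { type = "acronyms", log = ".alg", out = ".acr", inp = ".acn" },
  { type = "symbols",  log = ".slg", out = ".sls", inp = ".slo" },
  { type = "numbers",  log = ".nlg", out = ".nls", inp = ".nlo" },
  { type = "index",    log = ".ilg", out = ".ind", inp = ".idx" },
}
```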


I've seen that there are three or four possibilities for building the glossary. Currently, options 1-3 (TeX-driven, makeindex, and xindy) can be used easily with cluttex; options 2 and 3 are really easy with makeindex(-lite).

Some additional thoughts:

Maybe it would be better not to add all extensions by default, so that the current behavior is preserved if nothing is changed. If the user specifies additional extensions, multiple commands have to be specified (the question is still how).


So I'm still unsure how to do this the nice way.

atticus-sullivan commented 1 year ago

Currently I'm planning to implement something like this:

The interface will be passing formatted strings of the form `type[makeindex/xindy]:log:output:input:path-to-command:command args[optional]` via a new command-line argument (staying backwards compatible; `--makeglossaries` would be an incompatible(?) orthogonal option, as I plan on using makeindex and/or xindy directly). This argument will push tables like

-- store struct
local s = {
  type = "",
  log = "",
  out = "", -- infer from log (replace last char by 's')
  inp = "", -- infer from log (replace last char by 'o')
  path = "", -- try PATH
  args = "", -- infer from log, out and inp
  cmd  = "" -- built at last
}

to a list of glossary configurations (supporting multiple glossaries). We then just watch for changes in the respective `inp` files and, when one changes, run the generated `cmd` to build the respective `out` file.
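The watch step could be as simple as fingerprinting each `inp` file and rerunning the command only when the content differs. A minimal sketch of that idea (not cluttex's actual rerun logic, which I haven't checked):

```lua
-- Remember the last seen content of each inp file; rerun only on change.
local last_seen = {}  -- inp filename -> file content at last run

local function file_content(path)
  local f = io.open(path, "rb")
  if not f then return nil end
  local data = f:read("*a")
  f:close()
  return data
end

-- `run` is the executor (e.g. a wrapper around os.execute).
-- Returns true if cfg.cmd was run.
local function rebuild_if_changed(cfg, run)
  local now = file_content(cfg.inp)
  if now ~= nil and now ~= last_seen[cfg.inp] then
    last_seen[cfg.inp] = now
    run(cfg.cmd)
    return true
  end
  return false
end
```

Comparing whole file contents keeps the sketch simple; a hash would avoid keeping the files in memory.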

Quick demo code of the parsing:

```lua
local function split(str)
  local ret = {}
  local i = 1
  str = str..":"
  while true do
    local char = "\\"
    local s, e = i, i
    while char == "\\" do
      _, e = str:find("[^:]-:", i)
      if not e then break end
      char = str:sub(e-1, e-1)
      i = e+1
    end
    if not e then break end
    table.insert(ret, str:sub(s, e-1):gsub("\\:", ":").."")
  end
  return ret
end

print(table.concat(split("alg:acr:acn"), ", "))
print(table.concat(split("slg:sls:slo"), ", "))
print(table.concat(split("nlg:nls:nlo"), ", "))
print(table.concat(split("ilg:ind:idx"), ", "))

print("\nUSAGE: 'type[makeindex/xindy]:log:output:input:path-to-command:command args[optional]'\n")

local function loadCfg(str)
  local s = split(str)
  assert(#s >= 2 and #s <= 6, "wrong input")
  local ret = {}
  ret.type = s[1]
  assert(ret.type == "makeindex" or ret.type == "xindy" or (#s == 6 and s[5] ~= ""))
  ret.log = s[2]
  if #s >= 3 and s[3] ~= "" then ret.out = s[3] else ret.out = ret.log:sub(1,-2).."s" end
  if #s >= 4 and s[4] ~= "" then ret.inp = s[4] else ret.inp = ret.log:sub(1,-2).."o" end
  if #s >= 5 and s[5] ~= "" then ret.path = s[5] else ret.path = ret.type end
  if #s >= 6 then
    ret.args = s[6]
  else
    if ret.type == "makeindex" then
      ret.args = ("-t '%s' -o '%s' '%s'"):format(ret.log, ret.out, ret.inp) -- TODO shell escaping
    elseif ret.type == "xindy" then
      ret.args = ("-t '%s' -o '%s' '%s'"):format(ret.log, ret.out, ret.inp) -- TODO shell escaping
    else
      error("invalid state")
    end
  end
  -- build the command at last
  ret.cmd = ret.path.." "..ret.args
  return ret
end

local function printCfg(c)
  print(("type: %s, log: %s, in: %s, out: %s, path: %s, args: %s, cmd: \"%s\""):format(
    c.type, c.log, c.inp, c.out, c.path, c.args, c.cmd))
end

print("makeindex:main.ilg:main.ind:main.idx =>")
local c = loadCfg("makeindex:main.ilg:main.ind:main.idx")
printCfg(c)
```

As I'm not sure what we're currently parsing from the aux file or how to extract the bib2gls information, I'll refrain from implementing bib2gls-specific stuff for now.

The current plan is to give the user all possibilities: adjust the basename of the glossary files (the full filename has to be given, not just the extension), add arbitrary options to the called executable, and use an executable not contained in PATH.

Even if the type is unknown, it suffices to provide the command-line arguments, the path to the executable, and the filenames (so that we can watch for changes) to get it working (at least that's what is planned).
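As a concrete illustration of that fallback (the option name, tool, and paths here are hypothetical, just following the string format sketched above):

```shell
# Hypothetical: wire up a custom indexing tool that is not in PATH.
# Format: type:log:output:input:path-to-command:command args
cluttex -e lualatex \
  --glossaries='custom:main.xlg:main.xls:main.xlo:/opt/mytool/bin/mytool:-t main.xlg -o main.xls main.xlo' \
  main.tex
```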

If there are any thoughts on the plan I outlined so far just let me know (here in this issue).

minoki commented 1 year ago

I'm looking at the .aux file, and there are lines like `\@newglossary{main}{glg}{gls}{glo}`. I wonder if they could be parsed to detect the input/output files. What do you think?
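For what it's worth, those lines look easy to pick apart with a Lua pattern; a quick sketch (just an illustration, not tied into cluttex's existing .aux handling):

```lua
-- Extract the name and the log/output/input extensions from an
-- \@newglossary line found in the .aux file.
local function parse_newglossary(line)
  local name, log, out, inp =
    line:match("\\@newglossary{(.-)}{(.-)}{(.-)}{(.-)}")
  if name then
    return { name = name, log = log, out = out, inp = inp }
  end
end

local g = parse_newglossary("\\@newglossary{main}{glg}{gls}{glo}")
-- g.name == "main", g.log == "glg", g.out == "gls", g.inp == "glo"
```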

atticus-sullivan commented 1 year ago

Yes, you're right; in my case it is `\@newglossary{acronym}{alg}{acr}{acn}`. I just checked how the option is currently passed in #15 (`--glossaries="makeindex:main.gls:main.glo:main.glg"`), and I see that that one is more configurable in terms of whether to use makeindex or xindy. I'm not completely sure whether this somewhat more complex but also more flexible configuration is worth it. I think people who figure out how to set up additional manual glossaries can also find this option (but there might be some who just copy-pasted the glossary setup from tex.stackexchange). Maybe we can go on, check the .aux file for the common ones like glo/acn, and display a hint on how to set these things up.

But when I think about it, that's almost as complex as just using the information from the .aux file, so we might take the configuration from the .aux file and, if the user specified something else for that combination of input/output/log files, use that instead.

Any thoughts on where in the code such parsing could take place? (IIRC we're already parsing the .aux file at some point.)