jgm / pandoc

Universal markup converter
https://pandoc.org

org image ATTR_HTML are lost if image is followed by link #8821

Open VladimirAlexiev opened 1 year ago

VladimirAlexiev commented 1 year ago

(A bit related to https://github.com/jgm/pandoc/issues/7583)

I want images sized to a width of 200px, each followed by a link, as on https://bsdd.ontotext.com/README.html#acknowledgements (made with Emacs Org): [image]

Instead, the image size is lost and I get this: https://bsdd.ontotext.com/new/#acknowledgements [image]

The reason is that the attributes of an image followed by a link are emitted as a raw {=org} block in the Markdown output instead of as true image attributes. Below are some tests, with my comments marked by ###:

Version:

# pandoc -v
pandoc 3.1.2
Features: +server +lua
Scripting engine: Lua 5.4

# uname
CYGWIN_NT-10.0

Input org file:

# cat test7.org
Image with width, caption, placement:
  #+ATTR_HTML: :width 75% :placement [!htb]
  #+CAPTION: An example of a multi-domain network.
  [[./img/sample.png]]

Image with width:
  #+ATTR_HTML: :width 200px
  [[./img/sample.png]]

Image with width, followed by link:
  #+ATTR_HTML: :width 200px
  [[./img/sample.png]]
  [[https://graphdb.ontotext.com/][Ontotext GraphDB]]  ### this line makes trouble

Output with implicit_figures:

# pandoc test7.org -s -t markdown+implicit_figures
Image with width, caption, placement:

### kind of ok, but I asked for implicit-figures !
<figure width="75%" data-placement="[!htb]">
<img src="./img/sample.png" />
<figcaption>An example of a multi-domain network.</figcaption>
</figure>

Image with width:

### ok
![](./img/sample.png){width="200px"}

Image with width, followed by link:

### not ok, this {=org} code block does nothing. I have REMOVED one backtick so as not to confuse github
``{=org}
#+ATTR_HTML: :width 200px
``
![](./img/sample.png) [Ontotext GraphDB](https://graphdb.ontotext.com/)

Output without implicit_figures:

### the results are the same, no matter if I ask with or without implicit_figures

# pandoc test7.org -s -t markdown-implicit_figures
Image with width, caption, placement:

<figure width="75%" data-placement="[!htb]">
<img src="./img/sample.png" />
<figcaption>An example of a multi-domain network.</figcaption>
</figure>

Image with width:

![](./img/sample.png){width="200px"}

Image with width, followed by link:

``{=org}
#+ATTR_HTML: :width 200px
``
![](./img/sample.png) [Ontotext GraphDB](https://graphdb.ontotext.com/)

awelormro commented 3 weeks ago

The label for references is broken as well. I've written two Lua filters to help with this (I used AI for some parts of the API, not gonna lie). The first one makes the ATTR_HTML attributes take effect on images that are followed by links:

function process_blocks(blocks)
    local result = {}
    local i = 1

    while i <= #blocks do
        local block = blocks[i]

        -- Check whether this is the RawBlock that holds the HTML attributes
        if block.t == "RawBlock" and block.format == "org" and block.text:match("#%+ATTR_HTML:") then
            -- Extract every attribute that follows #+ATTR_HTML:
            local attributes = {}
            for key, value in block.text:gmatch(":(%w+)%s+([%w%p]+)") do
                attributes[key] = value
            end

            -- Check that the next block is a Para (expected to contain the Image)
            if i + 1 <= #blocks and blocks[i + 1].t == "Para" then
                local para = blocks[i + 1]

                -- Look for Images inside the Para
                for _, inline in ipairs(para.content) do
                    if inline.t == "Image" then
                        -- Add the extracted attributes to the image
                        for key, value in pairs(attributes) do
                            inline.attributes[key] = value
                        end
                    end
                end

                -- Add the modified Para to the result
                table.insert(result, para)
                i = i + 2 -- Skip both the RawBlock and the Para
            else
                -- If the next block is not a Para, just keep the RawBlock
                table.insert(result, block)
                i = i + 1
            end
        else
            -- Add any other block unchanged
            table.insert(result, block)
            i = i + 1
        end
    end

    return result
end

-- The filter is applied to the block lists of the document
return {
    {
        Blocks = process_blocks
    }
}
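
To see what the `gmatch` pattern in the filter above actually captures, here is a small standalone sketch that runs the same pattern on plain strings, outside pandoc. The function name `parse_attr_html` is made up for this illustration and is not part of pandoc's API.

```lua
-- Standalone sketch: the same pattern as in the filter, applied to a plain string.
-- parse_attr_html is a hypothetical helper name, not a pandoc function.
function parse_attr_html(text)
    local attributes = {}
    -- ":key" followed by whitespace and a run of alphanumeric/punctuation characters
    for key, value in text:gmatch(":(%w+)%s+([%w%p]+)") do
        attributes[key] = value
    end
    return attributes
end

local sized = parse_attr_html("#+ATTR_HTML: :width 75% :placement [!htb]")
print(sized.width, sized.placement)

-- The value pattern stops at the first whitespace, so a value that
-- contains spaces is cut down to its first word.
local alt = parse_attr_html("#+ATTR_HTML: :alt some text")
print(alt.alt)
```

One consequence worth knowing: because the value part cannot contain whitespace, attributes such as a multi-word `:alt` text would be truncated by this pattern.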

The second one fixes the labels: it reads the ATTR_HTML attributes and the label from the figure:

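function Figure(el)
    local vals = el.attributes
    -- Remove the attributes from the Figure element
    el.attributes = {}
    -- Get the image inside the figure
    local image = el.content[1].content[1]
    -- If the figure has no identifier, use the value of the first attribute
    -- (this assumes :id comes first in #+ATTR_HTML)
    if el.identifier == "" and vals[1] then
        el.identifier = vals[1][2]
    end
    -- Move the figure's attributes onto the image
    image.attributes = vals
    -- Debug output goes to stderr so it does not mix with the converted document
    io.stderr:write(el.identifier .. "\n")

    return el
end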

For the second filter, the only condition is this: if the label is given inside ATTR_HTML as an id, the :id should be the first attribute, because the filter reads the value of the first attribute. Something like this:

Example with id inside the html attr:

#+CAPTION: Prueba bonita
#+ATTR_HTML: :id fig:figch :width 130px
[[file:encuesta.png]]

Example with the label separated:

#+CAPTION: Prueba bonita
#+LABEL: fig:fig1
#+ATTR_HTML:  :width 130px
[[file:encuesta.png]]
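
The Figure filter takes the identifier from `vals[1][2]`, that is, the value of whichever attribute comes first. A rough sketch of a lookup by key instead of by position is shown below. It uses plain Lua tables as stand-ins, assuming the attributes behave as a list of {key, value} pairs (which is what the `vals[1][2]` indexing implies); `find_attr` is a hypothetical helper name, not a pandoc function.

```lua
-- Hypothetical helper: find an attribute by key in a list of {key, value} pairs.
function find_attr(attrs, key)
    for _, pair in ipairs(attrs) do
        if pair[1] == key then
            return pair[2]
        end
    end
    return nil
end

-- Plain tables standing in for the figure's attributes (an assumption for illustration).
local id_first = { {"id", "fig:figch"}, {"width", "130px"} }
local id_last  = { {"width", "130px"}, {"id", "fig:figch"} }

-- Positional access depends on the attribute order; lookup by key does not.
print(id_first[1][2], find_attr(id_first, "id"))
print(id_last[1][2], find_attr(id_last, "id"))
```

With a lookup like this, the order of `:id` and `:width` in the `#+ATTR_HTML:` line would no longer matter, although I have not tested it inside an actual filter.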

Just let me know if it helped :)