lierdakil / pandoc-crossref

Pandoc filter for cross-references
https://lierdakil.github.io/pandoc-crossref/
GNU General Public License v2.0

Org-reader subfigs feature #427

Open awelormro opened 8 months ago

awelormro commented 8 months ago

Hello!

Sorry to bother you again. I've been trying to get the subfigures feature working, so I wrote a Lua filter that reads subfigures from the figure environment, which pandoc parses as a Div. Here is what I have so far:

-- Pandoc filter to create subfigs
local function starts_with(start, str)
  return str:sub(1, #start) == start
end

function Div(div)
  if div.classes[1]=='figure' then
    -- Check whether the first and second elements are RawBlocks
    if div.content[1].t=='RawBlock' and 
       div.content[2].t=='RawBlock' then
       -- Generate id with the first raw block 
       local divi=div.content[1].text
       div.identifier=divi
       return div
    end
  elseif div.classes[1]=='subfigure' then
    if div.content[1].t=='Figure' then
      local imid= div.content[1].identifier
      local imag=div.content[1].content[1].src
      return imag
    else
      return div
    end
  else
    return div
  end
end

I've been trying to use the following file example.org to test:

#+begin_figure
#+LABEL: fig:subfigures
#+CAPTION: Subfigures Caption

#+begin_subfigure
#+CAPTION: Subfigure a
#+LABEL: fig:subfigureA
[[file:img1.jpg]]
#+end_subfigure
#+begin_subfigure
#+CAPTION: Subfigure b
#+Label: fig:SubfigureB
[[file:img2.jpg]]
#+end_subfigure
#+end_figure

Running pandoc with that filter and writing to native format, I get:

[ Div
    ( "#+LABEL: fig:subfigures" , [ "figure" ] , [] )
    [ RawBlock (Format "org") "#+LABEL: fig:subfigures"
    , RawBlock (Format "org") "#+CAPTION: Subfigures Caption"
    , Div
        ( "" , [ "subfigure" ] , [] )
        [ RawBlock (Format "org") "#+subfigure_CAPTION: Subfigure a"
        , RawBlock
            (Format "org") "#+subfigure_LABEL: fig:subfigureA"
        , Para [ Image ( "" , [] , [] ) [] ( "img1.jpg" , "" ) ]
        ]
    , Plain [ Image ( "" , [] , [] ) [] ( "img2.jpg" , "" ) ]
    ]
]
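Note that the Div identifier above is the whole raw line (`#+LABEL: fig:subfigures`) rather than just the label. A minimal sketch of how the value could be extracted before assigning it, assuming the label arrives as the first RawBlock as in the output above. The helper name `parse_org_keyword` is hypothetical and not part of pandoc or pandoc-crossref:

```lua
-- Hypothetical helper: split an Org keyword line such as
-- "#+LABEL: fig:subfigures" into a lower-cased key and its trimmed value.
-- Returns nil when the text is not a keyword line.
local function parse_org_keyword(text)
  local key, value = text:match('^#%+([%w_]+):%s*(.-)%s*$')
  if key then
    return key:lower(), value
  end
  return nil
end

-- Pandoc Lua filter entry point: when a "figure" Div starts with a raw
-- "#+LABEL:" line, use only the label value as the Div identifier.
function Div(div)
  if not div.classes:includes('figure') then
    return nil -- returning nil leaves the element unchanged
  end
  local first = div.content[1]
  if first and first.t == 'RawBlock' then
    local key, value = parse_org_keyword(first.text)
    if key == 'label' then
      div.identifier = value
      div.content:remove(1) -- drop the raw line once it has been consumed
      return div
    end
  end
  return nil
end
```

This only handles the label; the raw `#+CAPTION:` line would need similar treatment, and the nested subfigure blocks are left untouched.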

I'd like to know whether there is a way to carry over the image source and the caption text from the subfigure Div so that they can be used with crossref. I know the target output corresponds to the following markdown example:

<div id="fig:subfigures">
![subfigure a](img1.jpg){#fig:subfigurea}
![subfigure b](img2.jpg){#fig:subfigureb}

subfigures caption
</div>

Exported to native format, the output I need to get is:

[ Div
    ( "fig:subfigures" , [] , [] )
    [ Para
        [ Image
            ( "fig:subfigurea" , [] , [] )
            [ Str "subfigure" , Space , Str "a" ]
            ( "img1.jpg" , "" )
        , SoftBreak
        , Image
            ( "fig:subfigureb" , [] , [] )
            [ Str "subfigure" , Space , Str "b" ]
            ( "img2.jpg" , "" )
        ]
    , Para [ Str "subfigures" , Space , Str "caption" ]
    ]
]
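For reference, a rough sketch of how a Lua filter could build that target structure once the ids, captions and image sources have been collected. The function names `words_to_inlines` and `subfigure_div` and the shape of the `images` table are assumptions made for illustration; only the `pandoc.*` constructors come from pandoc's Lua API:

```lua
-- Hypothetical helper: turn plain text into Str elements separated by Space,
-- matching the shape of the captions in the native output above.
local function words_to_inlines(text)
  local out = {}
  for word in text:gmatch('%S+') do
    if #out > 0 then
      out[#out + 1] = pandoc.Space()
    end
    out[#out + 1] = pandoc.Str(word)
  end
  return out
end

-- Hypothetical builder: `images` is a list of tables with the fields
-- `id`, `caption` and `src`. All images go into one paragraph, followed by
-- a second paragraph holding the overall caption.
local function subfigure_div(id, caption, images)
  local inlines = {}
  for i, img in ipairs(images) do
    if i > 1 then
      inlines[#inlines + 1] = pandoc.SoftBreak()
    end
    inlines[#inlines + 1] = pandoc.Image(
      words_to_inlines(img.caption), img.src, '', pandoc.Attr(img.id))
  end
  return pandoc.Div(
    { pandoc.Para(inlines), pandoc.Para(words_to_inlines(caption)) },
    pandoc.Attr(id))
end
```

Collecting the ids, captions and sources from the org input is the part that depends on how pandoc's org reader parses the nested blocks, so it is deliberately left out here.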

I hope you can help me bring some crossref features to other extensions. Thanks!

lierdakil commented 8 months ago

I'm not familiar with orgmode, but from what I do know, shouldn't block attributes be specified before the begin_ thing?

FWIW, pandoc has some (poorly-documented) hacks to interpret org-mode as something similar to divs. For example, this kinda works without any additional filters:

#+attr_html: :id fig:foo
#+begin_figure

#+label: fig:subfigure1
#+caption: Subfigure Caption 1
[[file:file1.png]]
#+label: fig:subfigure2
#+caption: Subfigure Caption 2
[[file:file2.png]]

Subfigures caption
#+end_figure

Notice the weird attr_html thing, that one works, while name or label are simply eaten by the parser. I don't know org-mode semantics, but it seems like it might be a pandoc bug.

I don't really have the bandwidth to make your filter work, and I'm not sure it can work properly at all without patching pandoc, given that it seemingly eats the caption attribute on divs. E.g. this

#+caption: foo
#+begin_figure
#+end_figure

produces

[ Div ( "" , [ "figure" ] , [] ) [] ]

As you may notice, the caption attribute is entirely gone. You can hack around with this:

#+attr_html: :caption foo
#+begin_figure
#+end_figure

which produces

[ Div ( "" , [ "figure" ] , [ ( "caption" , "foo" ) ] ) [] ]

But that's at best clunky.
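If the `attr_html` workaround is used anyway, a small prefilter run before pandoc-crossref could move the attribute into the Div body, since the markdown form of a subfigure Div carries its caption as the last paragraph. This is only a sketch under that assumption, and it has not been tested against pandoc-crossref itself:

```lua
-- Pandoc Lua filter: for a "figure" Div that has a `caption` attribute,
-- remove the attribute and append its text as a final paragraph.
function Div(div)
  local caption = div.attributes['caption']
  if not div.classes:includes('figure') or not caption then
    return nil -- returning nil leaves the element unchanged
  end
  div.attributes['caption'] = nil
  -- The caption is kept as a single Str, so any markup inside it is not
  -- parsed; that is a simplification made for this sketch.
  div.content:insert(pandoc.Para({ pandoc.Str(caption) }))
  return div
end
```

Saved under a name of your choice, it would be passed with `--lua-filter` ahead of `--filter pandoc-crossref`, so that crossref sees the paragraph rather than the attribute.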

The exact place where all this dark magic happens is here, include a reference to it if you're going to open an issue upstream.

There is, however, another issue with org-mode. As far as I can tell, at least as far as pandoc is concerned, you can't have inline images with captions in org-mode. This is fine if you're not going to use the subfigGrid option. If you intend to, though, each paragraph with inline images is interpreted as a row in the grid, so you won't be able to put more than one subfigure per row unless this shortcoming (i.e. the inability to have inline images with captions/alt text) is addressed somehow on the pandoc side (or via an inventive use of attributes with a prefilter, but that's an ugly hack at best).

awelormro commented 8 months ago

Totally forgot about the attr_html feature, thanks a lot! I don't rely heavily on subfigures, but my paper publication workflow does need two or three of them in certain cases. This does exactly what I wanted: using the subfigures feature in crossref :) It also reminded me how to use a few things I had forgotten. It's not only the org-mode magic or the pandoc magic, it's a wizardry crash course in making it all work together xD But seriously, thanks for the effort and patience in explaining this. I promise to write a guide on using all of crossref's features with org-mode, if you think it would be helpful.

lierdakil commented 8 months ago

I'm open to accepting pull requests to the documentation, so if you want to add a section on using org-mode as an input format, either for your own future reference or as a public service, be my guest, I'm sure that'll be helpful to someone.

That being said, I personally don't use org-mode, and there aren't any tests for org-mode at the moment, so I can't guarantee nothing breaks down the line. FWIW, I will also accept a PR adding org-mode tests if you're up for it. You can use test/m2m as a template, converting input.md into org-mode (if possible/applicable). Those will need a harness to run (see test/test-integrative.hs), but I can handle that if you're not familiar with Haskell.