JacquesCarette / Drasil

Generate all the things (focusing on research software)
https://jacquescarette.github.io/Drasil
BSD 2-Clause "Simplified" License
140 stars 26 forks

HTML printing issue #3054

Closed tingyuw closed 1 year ago

tingyuw commented 2 years ago

After we flattened the sections into a linear structure, HTML no longer prints properly, because content is supposed to be nested in HTML. It's not that big an issue; only the indentation for subsections is missing. Instead of

image

it's now printing linearly like this, which makes the content display poorly.

image

I have some thoughts about it and am trying to write them down. If we are sticking with the new structure, I think printers should be the place where most of the adjustments are made. I haven't thought much about the implementation yet, just some random ideas.

My first thought is to have both linear and nested structures in the source language, since sometimes we need a nested structure (e.g., table of contents, HTML) and sometimes we want a linear one (e.g., notebook and LaTeX). The problem is that when we build an example, we have to build the document twice, once in the linear structure and once in the nested structure. I don't think this is something we are looking for; it's neither efficient nor user friendly, at least for now. (I was talking with Jason last week and he mentioned that having a UI for Drasil would be good. Maybe this won't be an issue once we have the UI, since we could just transfer the input contents into two different structures :-)

I was also thinking about updating the LayoutObj by adding a new type like SubSection to distinguish between main sections and subsections. Currently the layout object for sections is HDiv (I think it was built for HTML at first?), and all sections go into this category whether they are main sections or subsections. I was thinking that if we could differentiate the sections, it might be easier to wrap subsections in the main one. But then it occurred to me that we actually do know whether something is a subsection from its layer number. Then the next question is: how can I print the subsections inside the section? I'm not sure whether this can be solved just by adjusting the printer and how we print the contents, or whether it is an issue at a higher structural level. If it's not a simple printing issue, then I will have to make those sections nested to pretty-print the contents. For this I think we also need to adjust the source language a little to indicate the parent of each section, so we know how to restructure them in the printer or somewhere else. I remember we talked about how the printer should be straightforward and not do overly complicated work. So if it's something at a higher level, then maybe we should not do it in the printer.

Still thinking.

tingyuw commented 2 years ago

Notes from the meeting today:

  1. Add parent pointer in the source language
  2. Create a tree in the printer (maybe Import.hs?) and traverse it to print the contents
  3. One data structure (linear), but both pieces of information should be available
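
A minimal sketch of notes 1 and 2, assuming each flat section carries an optional parent identifier. `FlatSec`, `SecTree`, `buildForest`, and `outline` are hypothetical names for illustration only; Drasil's real `Section` and `UID` types are richer than this.

```haskell
-- Hypothetical stand-ins; not Drasil's actual types.
type UID = String

data FlatSec = FlatSec
  { secId     :: UID        -- this section's own identifier
  , secParent :: Maybe UID  -- Nothing marks a top-level section
  , secTitle  :: String
  } deriving (Eq, Show)

-- A rose tree: one section plus its nested subsections.
data SecTree = SecTree FlatSec [SecTree]
  deriving (Eq, Show)

-- | Rebuild the forest implied by the parent pointers.
-- Siblings keep the relative order they had in the flat list.
-- Sections whose parent is not reachable from a top-level section
-- are silently dropped in this sketch.
buildForest :: [FlatSec] -> [SecTree]
buildForest secs = map node (childrenOf Nothing)
  where
    childrenOf p = filter ((== p) . secParent) secs
    node s = SecTree s (map node (childrenOf (Just (secId s))))

-- | Pre-order traversal, pairing each title with its depth.
outline :: [SecTree] -> [(Int, String)]
outline = go 0
  where
    go d = concatMap (\(SecTree s kids) -> (d, secTitle s) : go (d + 1) kids)
```

The repeated `filter` makes this quadratic in the number of sections, which is probably fine for a document-sized list; a map keyed by parent would be the obvious refinement.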

tingyuw commented 2 years ago

I added a UID and a depth to the Section data to indicate the parent and the section level.

data Section = Section 
             { par  :: UID
             , dep  :: Depth 
             , tle  :: Title 
             , cons :: [SecCons]
             , _lab :: Reference
             }
makeLenses ''Section

I'm still trying to build a tree in the printer but haven't made much progress so far. I went over the code, and I think it might be possible to print the HTML the way we want by updating the current printer with the parent and level information, without building a tree. I think the changes would be made in Import -> Document

-- | Translates from 'Document' to a printable representation of 'T.Document'.
makeDocument :: PrintingInformation -> Document -> T.Document
makeDocument sm (Document titleLb authorName _ sections) =
  T.Document (spec sm titleLb) (spec sm authorName) (createLayout sm sections)

-- * Helpers

-- | Helper function for creating sections as layout objects.
createLayout :: PrintingInformation -> [Section] -> [T.LayoutObj]
createLayout sm = map (sec sm)

-- | Helper function for creating sections at the appropriate depth.
sec :: PrintingInformation -> Section -> T.LayoutObj
sec sm x@(Section _ depth titleLb contents _) = --FIXME: should ShortName be used somewhere?
  let refr = P.S (refAdd x) in
  T.HDiv depth [concat (replicate depth "sub") ++ "section"]
  (T.Header depth (spec sm titleLb) refr :
   map (layout sm depth) contents) refr

and in HTML -> Print

-- | Helper for rendering layout objects ('LayoutObj's) into HTML.
printLO :: LayoutObj -> Doc
printLO (HDiv _ ["equation"] layoutObs EmptyS)  = vcat (map printLO layoutObs)
printLO (HDiv _ ts layoutObs EmptyS) = divTag ts (vcat (map printLO layoutObs))
printLO (HDiv 0 ts layoutObs l) = refwrap (pSpec l) $
                                 divTag ts (vcat (map printLO layoutObs))  
printLO (HDiv n ts layoutObs l) = refwrap (pSpec l) $
                                 divTag ts (vcat (map printLO layoutObs))

These two parts create and print sections. I was wondering whether we could somehow map the subsections first and 'wrap' them inside their parent (but I'm not sure how to do it from this point). If we pass both parent and level information to HDiv, would that help?
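
One possible way to 'wrap' subsections using only the level information is sketched below. It assumes the flat list is already in document order, which may not hold in general. `Flat`, `Nested`, and `nestByDepth` are hypothetical names and do not exist in Drasil.

```haskell
-- Hypothetical stand-in for a flattened section: a depth plus a title.
data Flat = Flat { flatDepth :: Int, flatTitle :: String }
  deriving (Eq, Show)

-- A section together with the subsections wrapped inside it.
data Nested = Nested String [Nested]
  deriving (Eq, Show)

-- | Nest a flat, document-ordered list using only the depth field.
-- Each section claims the run of strictly deeper sections that
-- immediately follows it; those become its (recursively nested) children.
nestByDepth :: [Flat] -> [Nested]
nestByDepth [] = []
nestByDepth (Flat d t : rest) =
  Nested t (nestByDepth kids) : nestByDepth others
  where
    (kids, others) = span ((> d) . flatDepth) rest
```

If the list can arrive out of order, depth alone cannot recover the nesting, and the parent information would be needed as well.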

@JacquesCarette Do you think it's feasible, from a coding perspective, to print the HTML by making changes to the current printer, or do you think a tree will still be required? Even if we build the tree, the printers still need to be updated to traverse it, and I think that needs more additional work.

JacquesCarette commented 1 year ago

Going through older things, I just noticed that I had never replied to this - sorry!

Indeed, there are two ways to do this. First would be to create a new structure, like LayoutObj but closer to the needs of printing, that does use nesting. The other would be to enhance parts of LayoutObj with level information and parent information; just level is not enough, as you can't guarantee that things will come in the right order in the 'flat' version.

Since it's been quite a while, and we did talk about this in a meeting - are you still stuck on this, or did you go ahead and implement one solution?

tingyuw commented 1 year ago

No, I have made no progress on this. I've been working on generating the SRS example in JSON format. I thought I might need to make changes after your reply, so I've been holding off a little bit.

The first approach is similar to building a tree as the new structure, as we talked about, right? The second approach sounds more intuitive to me. I've added both level and parent information to Section, so I just have to pass that information to the printer. But I'm not sure where and how to modify the code to 'wrap' the subsections. However, the first approach makes more sense for the current printer, because LayoutObj is how we want to print the document, so we don't have to make too many changes in the printer. It's intuitive once we translate the contents to layout objects.

JacquesCarette commented 1 year ago

I agree with your description of the two approaches.

The one thing that might be missing, in both of them, is the 'sibling' order that comes for free in the nested version.

Here's one way to think about the process: insert all the sections into a "chunk database", sort it, and then print it from the start. You could put it into a LayoutObj after the sorting phase, to minimize changes to the printer.
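
A rough sketch of the insert-then-sort idea, under stated assumptions: every chunk records its parent and its rank among its siblings, and the parent links contain no cycles. `Chunk`, `ChunkDB`, `pathKey`, and `documentOrder` are hypothetical names, not Drasil's actual chunk database API.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (sortOn)

type UID = String

-- Hypothetical chunk: 'parent' and 'position' (rank among siblings)
-- are illustrative field names only.
data Chunk = Chunk
  { uid      :: UID
  , parent   :: Maybe UID
  , position :: Int
  } deriving (Eq, Show)

type ChunkDB = Map.Map UID Chunk

-- | Insert all chunks into the database, keyed by UID.
mkDB :: [Chunk] -> ChunkDB
mkDB cs = Map.fromList [(uid c, c) | c <- cs]

-- | Sibling positions from the root down to this chunk, e.g. [1, 2]
-- for the second subsection of the first section.
-- Assumes the parent links are acyclic.
pathKey :: ChunkDB -> Chunk -> [Int]
pathKey db = reverse . go
  where
    go c = position c : maybe [] go (parent c >>= (`Map.lookup` db))

-- | All chunks in document order. Lists compare lexicographically,
-- so a parent's key (a prefix of its children's keys) sorts first.
documentOrder :: ChunkDB -> [Chunk]
documentOrder db = sortOn (pathKey db) (Map.elems db)
```

In this sketch the depth of a chunk is `length (pathKey db c) - 1`, so the sorted list carries enough information to rebuild the nesting before handing it to a printer.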

tingyuw commented 1 year ago

I think the biggest problem here is deciding when to nest a section. Before, the printer didn't really have to think about when to nest the sections because they were already nested; the only thing to deal with was when to indent, which is why we have this function:

-- | Helper function for creating sections at the appropriate depth.
sec :: PrintingInformation -> Section -> T.LayoutObj
sec sm x@(Section _ depth titleLb contents _) = --FIXME: should ShortName be used somewhere?
  let refr = P.S (refAdd x) in
  T.HDiv depth [concat (replicate depth "sub") ++ "section"]
  (T.Header depth (spec sm titleLb) refr :
   map (layout sm depth) contents) refr

The old structure was nested: subsections were "inside" their parents. That relationship is missing in the new structure. We do have the information for indenting, which is the level information we assigned to each Section.

If we want to add information about order/siblings to each Section, doesn't that mean that, in some sense, we need another structure? One of the tricky things is that we have subsubsections (and maybe subsubsubsections in the future), so parent indication and order information don't seem to be enough?

I'm not sure I fully understand the part about inserting the sections into a chunk database. Why do we have to sort it if the sections are already built in order? How does this process solve the nesting/wrapping issue?

It still feels to me like a nested structure is a must. I was brainstorming and had an idea: for now, we declare and define the sections in a flat structure. Can we create a new nested structure and somehow "link" those nested sections to the flat ones, so that we don't have to build the sections and contents twice? Does that even make sense?

JacquesCarette commented 1 year ago

[This was covered in a meeting, where answers were provided verbally. Only a summary here.]

The basic problem is that we don't want to force users to give us the recipe for an SRS as a literal tree. We want the users to give us enough information so that we can reconstruct the implied tree (that we can pass to the various printers).

There are many ways to get there - I outlined some above. But, in the end, any means that is not a big burden for the user, is 'compositional' and lets us reconstruct the tree will do just fine.
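
As an illustration of what a printer could do once the tree has been reconstructed, here is a small sketch that emits nested HTML-like text. `SecNode`, `secClass`, and `renderSec` are hypothetical; the class naming follows the `concat (replicate depth "sub") ++ "section"` pattern quoted earlier in this thread, while the heading levels are an assumption and titles are not escaped.

```haskell
-- Hypothetical nested section: a title plus its subsections.
data SecNode = SecNode String [SecNode]

-- | CSS class for a given depth: "section", "subsection",
-- "subsubsection", and so on.
secClass :: Int -> String
secClass depth = concat (replicate depth "sub") ++ "section"

-- | Render one section as lines of HTML-like text, with each
-- subsection's div nested (and indented) inside its parent's div.
renderSec :: Int -> SecNode -> [String]
renderSec depth (SecNode title kids) =
  [ pad ++ "<div class=\"" ++ secClass depth ++ "\">"
  , pad ++ "  <h" ++ level ++ ">" ++ title ++ "</h" ++ level ++ ">"
  ]
  ++ concatMap (renderSec (depth + 1)) kids
  ++ [pad ++ "</div>"]
  where
    pad   = replicate (2 * depth) ' '  -- two spaces per nesting level
    level = show (depth + 1)           -- assumed mapping: depth 0 uses h1
```

Calling `putStr (unlines (renderSec 0 tree))` prints the nested output for a tree built by whichever reconstruction method is chosen.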

balacij commented 1 year ago

Since your work is completed, do you have an idea of what the final status of this ticket is, @tingyuw? :smile:

tingyuw commented 1 year ago

This issue is resolved.