ctargett / refguide-asciidoc-poc

Proof of concept of Solr Ref Guide converted to asciidoc format & using Asciidoctor for publishing

do something better with <br/> tags #31

Open hossman opened 7 years ago

hossman commented 7 years ago

This code has been in the jsoup cleanup code since before I started working with it...

// remove breaks -- TODO: why?
elements = docOut.getElementsByTag("br");
for (Element element : elements) {
  element.remove();
}

...and is currently the cause of some really ugly formatting in some files -- notably collections-api.adoc -- but a quick experiment in removing it and regenerating the adoc files doesn't look like a slam dunk: that may cause more problems than it solves.

Filing this issue as a reminder to look into this later -- even if it's not worth trying to make the code smarter, it's probably worth making a list of affected pages / sections so we can audit them after final conversion
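
A rough sketch of how such an audit list could be built, assuming the pre-conversion pages are available as `.html` files under one directory (that layout, the class name `BrAudit`, and its methods are all hypothetical and not part of this repo's converter). It uses only the Java standard library and a simple regex rather than jsoup, so it is a quick approximation, not a real HTML parse:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of the conversion code.
class BrAudit {
  // Matches <br>, <br/>, <br /> and <br clear="all">, case-insensitively.
  // The \b keeps unrelated tag names that merely start with "br" out.
  private static final Pattern BR =
      Pattern.compile("<br\\b[^>]*>", Pattern.CASE_INSENSITIVE);

  // Count the br tags in one page's markup.
  static int countBrTags(String html) {
    Matcher m = BR.matcher(html);
    int n = 0;
    while (m.find()) {
      n++;
    }
    return n;
  }

  // Walk dir and return (relative path -> br count) for every .html file
  // that contains at least one br tag, sorted by path.
  static Map<String, Integer> audit(Path dir) throws IOException {
    Map<String, Integer> counts = new TreeMap<>();
    try (Stream<Path> paths = Files.walk(dir)) {
      for (Path p : (Iterable<Path>) paths::iterator) {
        if (!Files.isRegularFile(p) || !p.toString().endsWith(".html")) {
          continue;
        }
        String html = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
        int n = countBrTags(html);
        if (n > 0) {
          counts.put(dir.relativize(p).toString(), n);
        }
      }
    }
    return counts;
  }

  public static void main(String[] args) throws IOException {
    // Assumption: the directory of exported pages is passed as the first argument.
    Path dir = Paths.get(args.length > 0 ? args[0] : ".");
    audit(dir).forEach((file, n) -> System.out.println(n + "\t" + file));
  }
}
```

Running it against the export directory would print one line per affected page with its br count, which could serve as the checklist to audit after the final conversion.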

ctargett commented 7 years ago

The API pages will all have this problem, and I'm content to leave them to cleanup.

I'd be interested to know if you've seen other areas where it's a problem, besides the lists of API calls that aren't actually lists.

hossman commented 7 years ago

I've created the keep-br-tags branch as a way to quickly compare what the changes would be if we stopped stripping br tags as part of the conversion -- see 6f2ca17c1cebe57c7641e6eb140d63abcd6868e9 for a list of impacted files.

There are lots of places where leaving these br tags in makes the formatting worse; here's a short summary of places where it clearly makes the formatting better...