Closed fref closed 7 years ago
Would you please provide your example "code" to format (the original and the misformatted output), so I may debug the problem?
I'll send you that next week.
On 3 November 2016 at 19:37, Ralf Stuckert notifications@github.com wrote:
Would you please provide your example "code" to format (the original and the misformatted output), so I may debug the problem?
Time flies. I've been overwhelmed at work; I'll make some time tomorrow.
On 5 November 2016 at 07:42, Frédéric Donckels frederic.donckels@gmail.com wrote:
I'll send you that next week.
Here's some sample code that produces the various "arrangements" (different indentation attempts) shown in the PDF I attached: TestLayoutIssue.pdf
As you can see, "higher" lines seem to be "randomly" inserted.
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;
import org.apache.pdfbox.pdmodel.PDPageTree;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import rst.pdfbox.layout.text.Alignment;
import rst.pdfbox.layout.text.Position;
import rst.pdfbox.layout.text.TextFlow;
import rst.pdfbox.layout.text.TextFlowUtil;
import java.io.File;
import java.io.IOException;
import java.util.GregorianCalendar;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import static org.apache.pdfbox.pdmodel.font.PDType1Font.COURIER;
@SuppressWarnings({"MagicNumber", "Duplicates"})
public class AAAALayoutTest extends Sample {
    private static final int FONT_SIZE = 7;
    private static final PDRectangle LANDSCAPE_A4 = new PDRectangle(PDRectangle.A4.getHeight(), PDRectangle.A4.getWidth());
    private static final Pattern PATTERN_LEADING_SPACES = Pattern.compile("^([ \t]+)", Pattern.MULTILINE);
    private static final String PREFORMATTED = "---\n" +
            "comment: Bla Bla bla\n" +
            "flushType: RESCAN\n" +
            "mandators:\n" +
            "\"BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA," +
            "B\n" +
            "LA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA,\"\n" +
            "messageFilter: |\n" +
            " ---\n" +
            " -\n" +
            " - !com.sample.test.commons.utilities.PropertyFilter\n" +
            " operator: EQUALS\n" +
            " propertyName: brokerMessageId\n" +
            " sourceClass: StuffMessage\n" +
            " value: !string \"00000000000000001-08-2015--13:58:55--2\"\n" +
            "reasonCode: 29\n" +
            "requesterId: sample2\n" +
            "riskType: BLA.BLA.BLA\n" +
            "states: \"1\"\n" +
            "userProfile: &3\n" +
            " description: Rights granted to edit everything\n" +
            " lastEditTime: \"2015-02-27T12:24\"\n" +
            " lastEditUser: sample3\n" +
            " name: \"superuser \"\n" +
            " rights: !org.hibernate.collection.PersistentSet\n" +
            " - !com.sample.test.commons.smc.model.SMCProfileRights\n" +
            " lastEditTime: \"2015-01-29T13:02\"\n" +
            " lastEditUser: sample3\n" +
            " mandator: &8 ALL\n" +
            " profile: *3\n" +
            " role: &9\n" +
            " description: \"SUPER\"\n" +
            " lastEditTime: \"2015-01-29T13:02\"\n" +
            " lastEditUser: sample3\n" +
            " version: &11 0\n" +
            " version: 36\n";
    private PDPageContentStream currentStream;
    private PDDocument pdfDocument;

    public PDPageContentStream getCurrentStream() {
        return this.currentStream;
    }

    public PDDocument getPdfDocument() {
        return this.pdfDocument;
    }

    @Override
    public void run() {
        this.pdfDocument = new PDDocument();
        PDDocumentInformation info = new PDDocumentInformation();
        info.setAuthor("Frédéric Donckels");
        info.setTitle("Test Layout Issue");
        info.setKeywords("Pdfbox Layout, Issue");
        info.setCreationDate(new GregorianCalendar());
        this.pdfDocument.setDocumentInformation(info);
        try {
            TextFlow indentedFlow;
            String indentedText;

            startNewPage();
            indentedText = indentLeadingSpace1(PREFORMATTED);
            indentedFlow = TextFlowUtil.createTextFlowFromMarkup(indentedText, FONT_SIZE, COURIER, COURIER, COURIER, COURIER);
            indentedFlow.drawText(getCurrentStream(), new Position(100, 500), Alignment.Left, null);

            startNewPage();
            indentedText = indentLeadingSpace2(PREFORMATTED);
            indentedFlow = TextFlowUtil.createTextFlowFromMarkup(indentedText, FONT_SIZE, COURIER, COURIER, COURIER, COURIER);
            indentedFlow.drawText(getCurrentStream(), new Position(100, 500), Alignment.Left, null);

            startNewPage();
            indentedFlow = indentLeadingSpace3(PREFORMATTED);
            indentedFlow.drawText(getCurrentStream(), new Position(100, 500), Alignment.Left, null);

            flushPage();
            File file = new File("TestLayoutIssue.pdf");
            this.pdfDocument.save(file);
            this.pdfDocument.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    protected void startNewPage()
            throws IOException {
        flushPage();
        PDPage page = new PDPage(LANDSCAPE_A4);
        this.pdfDocument.addPage(page);
        this.currentStream = new PDPageContentStream(this.pdfDocument, page, AppendMode.APPEND, true);
    }

    private void flushPage()
            throws IOException {
        if (null != getCurrentStream()) {
            getCurrentStream().close();
        }
    }

    private PDPage getCurrentPage() {
        PDPageTree pages = this.pdfDocument.getPages();
        return pages.get(pages.getCount() - 1);
    }
    private String indentLeadingSpace1(String text) {
        Matcher matcher = PATTERN_LEADING_SPACES.matcher(text);
        StringBuffer buffer = new StringBuffer(text.length());
        while (matcher.find()) {
            String group = matcher.group();
            int indent = 0;
            for (char character : group.toCharArray()) {
                if ('\t' == character) {
                    indent += 4;
                } else {
                    indent += 1;
                }
            }
            matcher.appendReplacement(buffer, String.format("--{%sem}", indent));
        }
        matcher.appendTail(buffer);
        return buffer.toString();
    }

    private String indentLeadingSpace2(String text)
            throws IOException {
        Matcher matcher = PATTERN_LEADING_SPACES.matcher(text);
        StringBuffer buffer = new StringBuffer(text.length());
        while (matcher.find()) {
            String group = matcher.group();
            int indent = 0;
            for (char character : group.toCharArray()) {
                if ('\t' == character) {
                    indent += 4;
                } else {
                    indent += 1;
                }
            }
            StringBuilder indentReplace = new StringBuilder();
            for (int i = 0; indent > i; i++) {
                indentReplace.append("~");
            }
            matcher.appendReplacement(buffer, indentReplace.toString());
        }
        matcher.appendTail(buffer);
        return buffer.toString();
    }

    private TextFlow indentLeadingSpace3(String text)
            throws IOException {
        TextFlow flow = new TextFlow();
        int textStart = 0;
        Matcher matcher = PATTERN_LEADING_SPACES.matcher(text);
        while (matcher.find()) {
            if (0 <= textStart) {
                flow.addText(text.substring(textStart, matcher.start()), FONT_SIZE, COURIER);
                textStart = matcher.end();
            }
            String group = matcher.group();
            int indent = 0;
            for (char character : group.toCharArray()) {
                if ('\t' == character) {
                    indent += 4;
                } else {
                    indent += 1;
                }
            }
            flow.addMarkup(String.format("--{%sem}", indent), FONT_SIZE, COURIER, COURIER, COURIER, COURIER);
        }
        flow.addText(text.substring(textStart), FONT_SIZE, COURIER);
        return flow;
    }

    public static void main(String[] args) {
        AAAALayoutTest test = new AAAALayoutTest();
        test.run();
    }
}
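For anyone who wants to see what the first variant (indentLeadingSpace1) actually hands to createTextFlowFromMarkup without building a PDF, here is a minimal stand-alone sketch that uses only the JDK. The class name IndentMarkupDemo and the method name toIndentMarkup are made up for illustration; the pattern, the tab-counts-as-four rule and the --{Nem} token are taken from the sample above. It only shows the string transformation and says nothing about how the library renders the result.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of pdfbox-layout.
final class IndentMarkupDemo {
    // Same pattern as PATTERN_LEADING_SPACES in the sample above.
    private static final Pattern LEADING = Pattern.compile("^([ \t]+)", Pattern.MULTILINE);

    /** Replaces each run of leading whitespace with an indent token such as --{2em}. */
    static String toIndentMarkup(String text) {
        Matcher matcher = LEADING.matcher(text);
        StringBuffer out = new StringBuffer(text.length());
        while (matcher.find()) {
            int indent = 0;
            for (char c : matcher.group().toCharArray()) {
                indent += (c == '\t') ? 4 : 1; // a tab counts as four columns, as in the sample
            }
            // quoteReplacement makes sure the token is inserted literally
            matcher.appendReplacement(out, Matcher.quoteReplacement("--{" + indent + "em}"));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String yaml = "role: &9\n  description: \"SUPER\"\n\tversion: 36\n";
        System.out.print(toIndentMarkup(yaml));
    }
}
```

Printing the intermediate string this way makes it easier to tell whether a misplaced line comes from the replacement step or from the layout step.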
Any update?
Sorry, I'm currently a bit overloaded with both private and business work. I'll try to have a look at this in the next few days.
After all: is there any line wrapping needed in the preformatted part? I mean, do you need something that would be equal to the HTML <pre> tag, whereby tabs would be handled correctly?
Yes, wrapping would be needed, otherwise data could get lost.
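To make the requirement concrete, here is a rough sketch of the kind of wrapping meant: long preformatted lines are cut at a column limit, no character is dropped, and continuation lines repeat the leading indent of the original line. It is JDK-only and independent of pdfbox-layout; PreWrapDemo, wrapLine and the column limit are hypothetical names and values, and the sketch assumes tabs have already been expanded to spaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of pdfbox-layout.
final class PreWrapDemo {
    /** Splits one line into chunks of at most maxCols columns, repeating its leading spaces. */
    static List<String> wrapLine(String line, int maxCols) {
        int indentLen = 0;
        while (indentLen < line.length() && line.charAt(indentLen) == ' ') {
            indentLen++;
        }
        String indent = line.substring(0, indentLen);
        String body = line.substring(indentLen);
        // Always leave room for at least one character so the loop makes progress.
        int width = Math.max(1, maxCols - indentLen);

        List<String> out = new ArrayList<>();
        if (body.isEmpty()) {
            out.add(line); // blank or whitespace-only line: keep as is
            return out;
        }
        for (int start = 0; start < body.length(); start += width) {
            int end = Math.min(body.length(), start + width);
            out.add(indent + body.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String part : wrapLine("  \"BLA,BLA,BLA,BLA,BLA,BLA,BLA,BLA\"", 16)) {
            System.out.println(part);
        }
    }
}
```

This cuts at a fixed column rather than at word boundaries, which suits data such as the long "BLA,BLA,..." value that contains no spaces to break on.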
Just to keep you informed: I'm working on the problem; hopefully there will be a release in the next few days.
:+1: I'm so eager to get rid of Jasper.
Ok, works in version 0.8.5. There were multiple problems. One of them: you are using addMarkup() to add the text. This will interpret all kinds of characters as markup. Use addText() instead ;-)
Great news! Thank you.
When outputting code (java, or formatted markup like yaml), it would be nice to have a simple way to keep the leading white spaces.
I tried something similar to your algorithm, and this works, in a way (but it is quite cumbersome), and it ends up showing something slightly weird for my samples (some lines are "spaced" when they shouldn't be).
It looks like newlines are inserted, but there are no newlines. I might be wrong, but this appears to be because TextSequenceUtil.wordWrap creates new instances of "Indent", new Indent(indentation).toStyledText(), which are using the default font and size. When remote debugging, the indentation seems to use FontDescriptor [font=PDType1Font Helvetica, size=11.0] instead of FontDescriptor [font=PDType1Font Courier, size=7.0] from the text.
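A toy model of why that would produce "spaced" lines, under the assumption (mine, not confirmed from the library source) that a line is as tall as the largest font size among its fragments. Fragment and lineHeight are hypothetical names, and the sketch uses only the JDK:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of pdfbox-layout.
final class LineHeightDemo {
    /** Stand-in for a styled piece of a line: just its text and font size. */
    static final class Fragment {
        final String text;
        final float fontSize;

        Fragment(String text, float fontSize) {
            this.text = text;
            this.fontSize = fontSize;
        }
    }

    /** Assumption: the tallest fragment decides the height of the whole line. */
    static float lineHeight(List<Fragment> line) {
        float max = 0f;
        for (Fragment fragment : line) {
            max = Math.max(max, fragment.fontSize);
        }
        return max;
    }

    public static void main(String[] args) {
        // A line without indent: Courier 7 only.
        List<Fragment> plain = Arrays.asList(new Fragment("reasonCode: 29", 7f));
        // A line whose indent fragment carries the default size 11 seen in the debugger.
        List<Fragment> defaultIndent = Arrays.asList(
                new Fragment("", 11f), new Fragment("operator: EQUALS", 7f));
        // The same line if the indent inherited the size of the text.
        List<Fragment> matchingIndent = Arrays.asList(
                new Fragment("", 7f), new Fragment("operator: EQUALS", 7f));

        System.out.println("plain: " + lineHeight(plain));
        System.out.println("indent with default font: " + lineHeight(defaultIndent));
        System.out.println("indent with text font: " + lineHeight(matchingIndent));
    }
}
```

Under that assumption only the indented lines grow taller, which would match the seemingly random extra spacing in the attached PDF.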