berteh / ScribusGenerator

Create beautiful documents with data. Open source pdf (and Scribus) template and mail-merge alternative.
http://berteh.github.io/ScribusGenerator/
MIT License

SG_NEXT-RECORD really difficult to use in complex templates #157

Closed garydale closed 5 months ago

garydale commented 4 years ago

My template contains a text box with %VAR_website% as its contents. The text box is a PDF annotation with http://%VAR_website% as its external website reference. The page contains two instances of the same design - the second moved down 90mm from the first.

After running the generator against the template, most of the substitutions work, but both %VAR_website% substitutions show the second record's information as both text and link target. I have to fix it by manually copying the correct information into both the link text and PDF annotation target.

Also, when I export the file as a PDF, there is no text displayed. If I click on where the text should be, the link works. I have verified that the text isn't just invisible - it's not there. I can't select it even when I use the Select All Text on Current Page option in Okular. I doubt this is a Generator fault however - since the first link is one I had to manually paste the data for.

BTW: I'm running Scribus 1.5.5 on a Debian/Testing system. The version of Generator is whatever I just downloaded from GitHub.

garydale commented 4 years ago

OK, so the link text issue is with Scribus but the substitution problem isn't confined to links. I converted my template (painfully) to use a work-around for the link text problem (it involves using two text boxes - one for the display text and a second one overlaying it for the PDF link). When I did this I got a usable output file that I could export as a PDF and have the links both display and work.

Unfortunately the problem with the second record's information replacing the first's got worse. Not only did it happen on all the links but also on non-linked information.

The workaround is to stop using the more-than-one-record-per-page feature. It's not ideal but until you can get this fixed, it seems necessary.

garydale commented 4 years ago

I've found another problem. It happens fairly frequently. I have a text field that is 4mm wide that sometimes gets changed to 3.509mm wide. This causes it to be too small for the text it contains (a single letter [variable] followed by a colon [constant text]) so I have to manually adjust it.

I verified that the template fields (there are 12 of them that are this size) are all set to 4mm. It seems to be able to happen to any one of them. It may be a more general issue that I only notice in this case because it's the only time when the loss of half a millimeter is clearly visible.
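One way to find the affected boxes without opening each one's properties would be to scan the generated .sla file. The following is only a sketch under assumptions: it assumes frames are stored as `PAGEOBJECT` elements with a `WIDTH` attribute in points and optional `ANNAME` and `OwnPage` attributes, and the function name `boxes_off_width` is hypothetical, not part of ScribusGenerator.

```python
import xml.etree.ElementTree as ET

PT_PER_MM = 72 / 25.4  # SLA lengths are assumed to be in points


def boxes_off_width(sla_xml, expected_mm, window_mm=1.0, tolerance_mm=0.01):
    """Return (name, page, width_mm) for frames close to, but not at, expected_mm.

    Frames further than window_mm from the expected width are treated as
    unrelated objects and skipped.
    """
    root = ET.fromstring(sla_xml)
    hits = []
    for obj in root.iter("PAGEOBJECT"):
        width_mm = float(obj.get("WIDTH", "0")) / PT_PER_MM
        delta = abs(width_mm - expected_mm)
        if tolerance_mm < delta <= window_mm:
            hits.append((obj.get("ANNAME", ""), obj.get("OwnPage", ""),
                         round(width_mm, 3)))
    return hits
```

Calling it as `boxes_off_width(open("output.sla", encoding="utf-8").read(), 4.0)` would list only the boxes that drifted away from 4mm, which should make it easier to see whether the same template field is affected every time.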

garydale commented 4 years ago

I had another pile of information that I needed to import with alternating styles for each record - to make a "striped table" with one line per record. I made it as a large text box then copied blocks of two lines to fill the page. Each line looks like %VAR_clubname% %VAR_visitor% %VAR_date% %SG_NEXT-RECORD%

This almost works. What's going wrong is that %VAR_date% is being replaced by the date from the next record, not the current one. The last record picks up whatever garbage is in the buffer (I guess). The total substitution goes on for 3 pages and it's the same on each page. Things are good except the date is from the subsequent record.

Changing the lines to %SG_NEXT-RECORD%%VAR_clubname% %VAR_visitor% %VAR_date%
made a real mess of the table. Putting a space between the NEXT-RECORD and clubname fixed that and everything works.

I would expect the %SG_NEXT-RECORD% to be processed when it is found so at the end of one line or the beginning of the next line shouldn't matter. Also, the token is delimited by %'s so I don't know why you also seem to require a space between tokens for proper operation. Fortunately I note that the space is dropped from the substitution.

berteh commented 4 years ago

Hi Gary. Sorry to read the use of ScribusGenerator is not straightforward.

Indeed the use of %SG_NEXT-RECORD% is a bit tricky, as its effects depend on the position of the other elements (text boxes, images,...) with respect to that token in the source (SLA XML) file.

This is particularly bothersome as the sequence of elements has normally no meaning in XML. And indeed grouping and ungrouping elements, for instance, has an impact on the XML sequence of elements, but nothing that can be seen in the Scribus view.

So I guess your %VAR_date% is actually located after the %SG_NEXT-RECORD% token in the SLA file, even though they are likely 'siblings' in XML (and Scribus) terms. As a workaround I'd suggest you edit (a copy of) your SLA with some text editor and move the page element with %SG_NEXT-RECORD% so that it is the last child of its parent element... if that is not too scary for you.

I suspect deleting the token element (paragraph?) in Scribus, and adding it again after everything else is ready, should have the same effect. For instance as a new line in your box (the empty line should be removed automatically by the script).

A walkthrough that helps with coping with the unintuitive behaviour of my script in this regard: https://github.com/berteh/ScribusGenerator/wiki#cards-deck-walkthrough

If you find a better solution kindly let me know, as this is indeed not the best way to have this feature. Priority was on speed at the time of writing this plugin... and not using the embryo of Scribus API gave me a lot of speed and flexibility... but at the cost of this not satisfactory multiple records mechanism.

Hope this helps.

berteh commented 4 years ago

About the changing box size please open a new issue, for ease of tracking.

I had a seemingly similar problem once: the document settings and page settings were different, as I had started with a US letter format template and then changed page format to EU A4. Some boxes were then squeezed when the script had to create new pages, as it did not know which of letter or A4 was indeed the right size.

Making sure the document and page sizes all match solved the issue, if I remember correctly.
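A quick way to check for such a mismatch could be to compare the sizes stored in the SLA file. This is a minimal sketch, assuming the document default size is stored as `PAGEWIDTH`/`PAGEHEIGHT` attributes on the `DOCUMENT` element and each `PAGE` or `MASTERPAGE` element carries its own `PAGEWIDTH`/`PAGEHEIGHT` (and a `NUM` attribute); the function name `mismatched_pages` is hypothetical.

```python
import xml.etree.ElementTree as ET


def mismatched_pages(sla_xml, tolerance=0.01):
    """List pages whose size differs from the document default (in points)."""
    root = ET.fromstring(sla_xml)
    doc = root if root.tag == "DOCUMENT" else root.find("DOCUMENT")
    if doc is None:
        raise ValueError("no DOCUMENT element found")
    doc_size = (float(doc.get("PAGEWIDTH")), float(doc.get("PAGEHEIGHT")))
    bad = []
    for page in doc.iter():
        if page.tag not in ("PAGE", "MASTERPAGE"):
            continue
        size = (float(page.get("PAGEWIDTH")), float(page.get("PAGEHEIGHT")))
        # Compare width and height separately against the document default.
        if any(abs(a - b) > tolerance for a, b in zip(size, doc_size)):
            bad.append((page.tag, page.get("NUM"), size))
    return bad
```

An empty result would suggest the document and page sizes agree; any entry points at a page (or master page) that still has the old format.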

garydale commented 4 years ago

The document is created from the template, which is just a single page. I never changed the size but some text boxes have their size and/or position slightly altered during the generator run. Because they are small fractions of a millimeter, you wouldn't normally see it unless you look at the properties. The same box would have slightly different sizes on some pages, which I noticed when "w:" showed as "w" with the text overflow marker appearing.

berteh commented 4 years ago

would you mind attaching your template and a dummy csv (can be the same line repeating all over) ?

garydale commented 4 years ago

I had to rename it as .txt clubs-template.txt

here's some test data - also renamed

clubs-test.csv.txt

I modified the template to use 5mm wide boxes instead of 4mm to accommodate the wider letters, but I still found some boxes had their widths changed. Also the main frame around the club's information often shifts up or down, as do other elements. I usually ignore the displacements because they are small.

berteh commented 4 years ago

I just ran your example, thanks. Aside from the fact that the important variables are missing from the csv (%VAR_officer_0_teltype_0%, %VAR_officer_0_teltype_2%, %VAR_officer_0_teltype_1%), I generated a single merged SLA where all small boxes were exactly 5mm high and wide... and empty aside from :

I did a second test by populating these variables:

| officer_0_teltype_0 | officer_0_teltype_1 | officer_0_teltype_2 |
| --- | --- | --- |
| a | w | l |
| b | x | m |
| c | y | n |
| d | z | p |

and it generated all pages OK without a glitch, and without changing the small boxes size... so I guess the problem lies not in the script.

I did see the small box getting smaller though, to 4.68 x 5.00mm, on the one occasion where I resized it manually by mistake when I copy-pasted its content; it was just too small not to touch the sides. I don't know if Scribus provides the ability to lock the size of boxes, but if it does that might be a solution.

What my script does, though, is modify the position of boxes when you use the merge option, and in this case I noticed a slight vertical shift in some cases, generally increasing linearly. It has to do with margins and display units not being interpreted well. It is still not fixed, but it shouldn't impact you for a few pages.

berteh commented 4 years ago

feel free to reopen if needed.

and of course, on the one occasion I tripped and reduced the box size (was for officer 0 teltype 0 in my case), that single box was reduced for all clubs, but showing only on those occasions where the type letter was a wide one (m or w for instance, no problem with i or l.)

garydale commented 4 years ago

You may need to run it for more records. As I said, it didn't happen often but I'm pretty sure I didn't nudge the box multiple times. The file I use had around 90 records and I noticed it a half-dozen times.

I've been doing a new template for a two-line striped table and even putting the %SG_NEXT-RECORD% at the start of the second and subsequent template sections didn't work. I'm getting the wrong information in some fields. This one is similar to the clubs template (in the officers section) except that there are something like 20 records to a page so reverting to one to a page (like I did for the clubs) is not all that workable.

berteh commented 4 years ago

Ok. I'll try with more records.

If it's okay with you I'd like to use your help to find a lasting solution to this 'next-record' problem. Are you comfortable enough with computers to edit the SLA manually in a text editor? and do you know a bit about XML (difference between attributes and elements?)

The generator script, for now, reads the source XML lines in sequence, and loads the next record's data as soon as it sees the SG_NEXT-RECORD marker, and only then does the substitutions for the current element.
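To illustrate why the position of the marker matters, here is a toy model of that behaviour. It is not the actual ScribusGenerator code, just a sketch under the assumption that each text run is handled as one element in source order; the function name `merge` and the sample data are made up.

```python
import re

TOKEN = "%SG_NEXT-RECORD%"


def merge(elements, records):
    """Substitute %VAR_x% in each element, walking elements in source order.

    When an element contains the next-record token, the record pointer
    advances first, and only then is that same element substituted.
    """
    out = []
    index = 0
    for text in elements:
        if TOKEN in text:
            index += 1                        # advance before substituting
            text = text.replace(TOKEN, "")
        record = records[index] if index < len(records) else {}
        filled = re.sub(r"%VAR_(\w+)%",
                        lambda m: record.get(m.group(1), ""), text)
        out.append(filled.strip())
    return out


records = [{"clubname": "A", "date": "d1"}, {"clubname": "B", "date": "d2"}]

# Marker in the same element as the date, at the end of each line:
trailing = ["%VAR_clubname%", "%VAR_date%%SG_NEXT-RECORD%",
            "%VAR_clubname%", "%VAR_date%%SG_NEXT-RECORD%"]

# Marker at the start of the second and later lines:
leading = ["%VAR_clubname% %VAR_date%",
           "%SG_NEXT-RECORD% %VAR_clubname% %VAR_date%"]
```

In this model `merge(trailing, records)` gives `['A', 'd2', 'B', '']`, i.e. each date comes from the following record, while `merge(leading, records)` gives `['A d1', 'B d2']`.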

One way I had to fix really complex templates was to add some sign to the 'first' element of my new line (for instance a dummy Scribus attribute called 'iAmNext'), and making sure all the subsequent graphic elements with variables would be included in a 'child' page element. Grouping them does not do the trick as it is a mere XML attribute to Scribus, and has no impact on the XML order, which seems only related to the order in which you create the elements.

I would then create a new Scribus page object with the SG_NEXT-RECORD tag, save, and edit the source code in a text editor to make sure this page object comes before any other variable in the repeating object (by looking at the dummy 'iAmNext' attribute).

Back in Scribus, I would then duplicate and lay out the element to be repeated in the right order (very important). In case of alternating rows, for instance, I'm not even sure you could duplicate one odd and one even row simultaneously if you wanted to be sure of their sequence in the source code. Pity, but it did work.

Once we are sure this works, maybe we can find a better way? My problem lies really in the way Scribus stores the XML page objects: their position in the source code should not be used, at all... But I found no other way to detect the order in which to repeat & substitute the elements in a multiple records scenario that was user friendly enough.

To get a grasp, just imagine you want to turn your alternating LINES design into an alternating ROWS design. The current implementation would very well allow you to do that... but with the same difficulties you are experiencing. That is, you would need to create the next-record marker, then completely design one row, then duplicate the rows in alternation... bothersome, but it works and is flexible.

Any better idea?

B

garydale commented 4 years ago

Sounds good. Yes, I frequently use KATE to edit .sla files (mainly to remove "Copy of" styles).

I actually use XML files but then have them converted to .csv files. It's fast and accurate - sometimes it finds problems with the XML that the XML validator doesn't. I rarely use attributes but I do know the difference.

Here's another template that I've tried after my previous attempt for this template went down in flames. It has fewer problems but it's got one big one. The second record on a page vanishes except for the email address, which shows up in the PDF annotation over the email text (or at least where it should be, except I can't get Scribus to reliably position it within a big text box) for the second record, which otherwise contains information for the third record. All subsequent records on the page seem to be OK.

chairs-template.sla.txt

I modified the template to remove the bottom record on each page then shifted the remaining records (except for the first) down one spot. This would allow me to manually add the second record on each page. It also had the welcome side effect of fixing the email PDF annotation.

When I tightened up the spacing (see template below), I got some weirder behaviour. The first record was dropped and the second showed up twice. The subsequent records showed up with record pair reversed (e.g. record 3 showed up after record 4).

chairs-template.sla.txt

Here's yet another rendition of the template along with a test file that excludes the people's e-mail and telephone numbers (I've used dummy data that may be more useful). What's interesting is that the second record is actually the first one that should have been on the second page, while the second record on the second page should have been the first record on the third page....

I think all the records eventually turn up in the output document with all the correct information, even the right PDF annotations, but they are in some semi-random order. A lot of cut & paste in store to fix this one. :(

chairs-template.sla.txt chairs.csv.txt

OK. I seem to be missing one or two records per page.

berteh commented 4 years ago

difficult for me to say where your template got some problems... but it sure works fine (incl. pdf link annotations for 2 phones and mails, pictures, not losing any data...):

chairs-template.B.csv.txt chairs-template.B.sla.txt chairs-template.Bdouble.sla.txt

I'll try to do a screencast on the topic, but basically my way of doing it is:

  1. Create one item, fully laid out.
  2. Add %SG_NEXT-RECORD% before the first element of this item. Make it smaller if it's touching a text element, so that Scribus makes it a different (ITEXT) element entirely, which will be cleaned up entirely by the script.
  3. Check the source code in an XML editor to make sure SG_NEXT-RECORD is indeed located before any other variable in the source code (both in sequence AND in hierarchy: siblings and children are OK, but no other variable should be in an ancestor of it).
  4. Back in Scribus, group all elements of the item (ctrl+g).
  5. Menu Item > Multiple Duplicate to duplicate as many times as needed (even multiple rows).
  6. Retouch the style if needed (e.g. ungroup to alternate colours, then regroup items separately).
  7. Ungroup the first element of the page (ctrl+shift+g) to remove its SG_NEXT-RECORD marker, as leaving it would skip the first record (of each page).
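The source-order check in step 3 can be partly automated. The sketch below only lists the marker and variables in the order they appear in the file; it does not verify the ancestor rule separately. It assumes tokens appear in attribute values or element text of the SLA (for instance in ITEXT elements), and the function name `tokens_in_source_order` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

TOKENS = re.compile(r"%SG_NEXT-RECORD%|%VAR_\w+%")


def tokens_in_source_order(sla_xml):
    """Return (tag, token) pairs in the order they occur in the XML source."""
    root = ET.fromstring(sla_xml)
    found = []
    for element in root.iter():              # depth-first, i.e. source order
        values = list(element.attrib.values())
        if element.text:
            values.append(element.text)
        for value in values:
            for match in TOKENS.finditer(value):
                found.append((element.tag, match.group(0)))
    return found
```

Printing the result for a template should show every repeated item starting with `%SG_NEXT-RECORD%` (except the first item on the page), followed only by that item's variables.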

Run the generator script with appropriate data in "merge" mode, and enjoy the nice work result:

garydale commented 4 years ago

Well, I guess that's my problem - not checking it in an XML editor. ;) My method is a little different: 1) create the first item 2) copy & paste it then move it to the second position. 3) touch up the second item if needed. 4) add %SG_NEXT-RECORD% before the first element of the second item. 5) if I need more items (e.g. striped table) copy & paste the first two items then move them to the third & fourth positions. 6) add %SG_NEXT-RECORD% before the first element of the third item. 7) copy, paste & move items 3 & 4 as needed to fill the page.

Apart from not checking the XML, I also don't group and ungroup items. I just select the entire item or items. Rather than multi-dup, I copy & paste. Does grouping have an impact on how Scribus stores the items?

berteh commented 4 years ago

I just group them because it makes the layout of repeating elements a breeze with the "align and distribute" toolbox. Really nice things in there to either glue things with no spacing, or spacing them evenly, align, spread... you name it.

The only visible impact I had with grouping actually speaks against it: some groups did not export to PDF, whereas ungrouping them made them export all right. Discussion and example at http://forums.scribus.net/index.php/topic,3740.msg17653.html#msg17653

berteh commented 5 months ago

using SG_NEXT-RECORD is still not straightforward, but unlikely to improve much in the future so I'll close this issue for now.

the best available doc is compiled at https://github.com/berteh/ScribusGenerator/wiki/How-to-use-%25SG_NEXT-RECORD%25 kindly contribute by improving it there if you'd like... and don't hesitate to get back in touch if you have a better idea how to handle all this mess !