Add new --batch option to read batch commands from stdin

jgoerzen commented 3 months ago

Makes it possible to generate more complicated multi-block layouts in the CLI, akin to the GUI. I started this because I wanted the QR code after, rather than before, the text.

For now, I define only text and qr commands. If the maintainers accept this patch, I will round it out with more commands.

maresb commented 3 months ago

This is a really cool proof of concept!!!

I'd be reluctant to actually build in support for a new label specification language. In the same spirit I would much rather refine the current internal representation so that it's serializable and deserializable. Then we could achieve the same functionality. Moreover, this would also make it easy to decouple the Labelle front-ends (CLI and GUI) from the local USB backend. For example, you could make a webserver that receives a label specification by a web API endpoint and then prints it.

What do you think?

jgoerzen commented 3 months ago

Part of what I wanted to do here is drive Labelle from a shell script that takes output from a Rust program. This makes it trivially easy to do that; the shell is limited in how it can do escaping, for instance, and this format requires no escaping. It is also very easy to parse, and very easy to generate from the shell, e.g.:

echo "TEXTSTART:$TEXT"

My use case here is generating labels using Library of Congress classification markings (on two lines) plus QR codes. By combining them like this, I save label tape, since I don't have to waste the leading and trailing margin for each one. I previously had modified cli.py to put the QR code after the text; this avoids that modification, and lets me use multiple blocks at once.

There's nothing to say a later serialize/deserialize couldn't work also. This is just really, really simple, integrates with the shell, and can be fleshed out pretty easily too.

maresb commented 3 months ago

Ya, I really appreciate the simplicity of this format. We should think through it carefully though. Like probably we would need some escaping, e.g. newlines in QR codes. And would this be robust enough that we could extend to a full label specification language?

jgoerzen commented 3 months ago

Ahh, I hadn't thought of newlines in QR codes. That could work the same as the text blocks, I think (could add additional lines to it with the same mechanism). I'll push another commit to add that. But I don't think any other format supports newlines, so I don't think any escaping will actually prove necessary anywhere.

My thought here was to extend the CLI. The existing options in the device configuration, design, and label dimensions boxes would remain as CLI parameters only. The options in elements (except for barcode-with-text, which would be unnecessary since the batch mode would already support that) would be supported by the batch language.

Basically, think of it like lpr : I can pipe something to lpr -Pinkjet, with various options for paper size, etc. What I pipe to lpr might be plain text, or literally a PostScript program, etc.

maresb commented 3 months ago

I don't really like this XSTART: and XADD: syntax. Would it make sense to instead do something like this?

TEXT:Line1
NEWLINE
TEXT:Line2
TEXT:BIG

so NEWLINE appends to the previous element with a newline separator, and you raise an exception if the element type changes, e.g. the following would be invalid:

TEXT:Line1
NEWLINE
QR:Line2

jgoerzen commented 3 months ago

My preference would be to keep it similar to the way I initially did it. I believe it's clearer to the user, is easier to implement in Labelle, and doesn't introduce an exception case. If it would be clearer, instead of TEXTSTART and TEXTADD, it could be just TEXT and TEXTADD or something, so the one-line case would be equally simple.

However... if you wouldn't merge that and would merge with your idea, I'll rework. I might suggest TEXT/TEXTNEWLINE and QR/QRNEWLINE which would be clearer. I would want to try to implement the whole thing without needing to generate an exception -- I think that would be possible.

maresb commented 3 months ago

Why do you find it desirable to not generate an exception? Unless there's some very good reason then exceptions should be raised for any invalid input. Why do you want to only warn on an invalid command? If a user makes a mistake in the label specification I would much rather it not print rather than waste tape. Also wouldn't the following be invalid under your scheme?

TEXTSTART:x
QRADD:y

I think TEXT/TEXTNEWLINE would be an improvement and reasonably self-documenting.

Would you be able to keep it extensible, for example if we want to be able to add barcodes in the future then we should avoid flush_both that would tie us into something binary, and instead use something like flush_all?

jgoerzen commented 3 months ago

> Why do you find it desirable to not generate an exception?

A format that can't have an inconsistent state is better than a format that can have an inconsistent state. We can still have invalid command lines (e.g., "XYZ:foo"), but as long as the input is syntactically correct, it should be fine.

How about a different approach: a new command PRINT or FLUSH or something. It would cause whatever has accumulated in the TEXT or QR buffers to be printed. A PRINT or FLUSH would be implied after barcodes (which can't be multiline) or when transitioning from one block to another. The example in the README could then read:

TEXT:FD12
TEXT:2013
QR:1234
TEXT:BIG
FLUSH
TEXT:LINE1
TEXT:LINE2
QR:12345

Come to think of it, I like this better than either of our proposals above. This method:

Has no inconsistent states
Easily handles the common case (one text block with an arbitrary number of lines, or one text block with an arbitrary number of lines alongside a barcode, qr code, or image)
Can still accommodate adjacent TEXT blocks by using FLUSH
Is clearer than having to explicitly continue with a NEWLINE
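
A minimal sketch of the FLUSH semantics described above, assuming only TEXT and QR commands exist and that a block is represented as a (type, content) pair. The function name `parse_flush_style` and the block representation are hypothetical, chosen for illustration; this is not code from the PR.

```python
from typing import Optional


def parse_flush_style(spec: str) -> list[tuple[str, str]]:
    """Group consecutive same-type lines into blocks; FLUSH ends the current block.

    Hypothetical helper: the name and the (type, content) block
    representation are assumptions, not part of Labelle.
    """
    blocks: list[tuple[str, str]] = []
    kind: Optional[str] = None   # type of the block being accumulated
    lines: list[str] = []        # lines accumulated for that block

    def flush() -> None:
        nonlocal kind, lines
        if kind is not None:
            blocks.append((kind, "\n".join(lines)))
        kind, lines = None, []

    for raw in spec.splitlines():
        if raw == "FLUSH":
            flush()              # a FLUSH with nothing buffered is a no-op
            continue
        command, sep, content = raw.partition(":")
        if not sep or command not in ("TEXT", "QR"):
            raise ValueError(f"invalid command line: {raw!r}")
        if command != kind:
            flush()              # implied FLUSH when the block type changes
            kind = command
        lines.append(content)
    flush()                      # emit whatever is left at end of input
    return blocks
```

Under these assumptions the README example yields five blocks: FD12/2013 as one two-line TEXT block, the QR 1234, BIG on its own, LINE1/LINE2 as another TEXT block, and the QR 12345. A leading FLUSH is simply ignored.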

> then we should avoid flush_both that would tie us into something binary, and instead use something like flush_all

I'm not a pedant on naming and would have no problem renaming that. However, I don't think any of the barcode formats support newlines so there would be no need for buffers for them. Just text and QRs support multiple lines, if I remember correctly.

jgoerzen commented 3 months ago

Incidentally, to show off what I'm doing with this:


I am organizing my books by Library of Congress Classification (LCC) code. Each book gets a label with the LCC as well as a QR code. The QR encodes the book's ISBN if present, or a locally-generated ID number if not. This hooks in with LibraryThing.

So I wrote a Rust program that takes the LCC and ISBN/number. It uses an algorithm to split the LCC across 2 lines in a way that produces the narrowest possible rendering (to make the labels as small as possible). It emits lines for labelle --batch that print the LCC and the QR.
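
The Rust program itself isn't shown in this thread. As a rough illustration of the two-line split, here is a hypothetical Python sketch that assumes the call number may only be broken at spaces and uses character count as a stand-in for rendered width. Both are assumptions; the real program may break at other points and measure actual glyph widths.

```python
def split_two_lines(call_number: str) -> tuple[str, str]:
    """Split a call number at a space so the longer line is as short as possible.

    Hypothetical helper: break points (spaces only) and the width measure
    (character count) are assumptions made for illustration.
    """
    parts = call_number.split()
    if len(parts) < 2:
        # Nothing to split on; keep everything on the first line.
        return call_number.strip(), ""
    candidates = [
        (" ".join(parts[:i]), " ".join(parts[i:]))
        for i in range(1, len(parts))
    ]
    # The narrowest rendering is the split whose longer line is shortest.
    return min(candidates, key=lambda pair: max(len(pair[0]), len(pair[1])))
```

The two returned strings would then become the two lines of one text block in the batch input.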

I wrap that in a shell script that calls the Rust program in a loop, appending to the end of a batch file. When I've got enough to print a label, I Ctrl-C it, then run it through labelle --batch. I usually do about 4 books at a time. Then I cut the labels apart and apply them.

My shell script wrapper also detects if I'm adding the first or a subsequent label to the batch file. With a little sed, it prepends a space to both lines of the second and subsequent text blocks to add a little more space next to the previous QR code, to make cutting them apart easier.

All this I'm already doing.

I am using this flow as I am working with books that haven't already been added to my LibraryThing database. Most of my books have. When I get to those, I'll update the Rust program to be even easier. Using a TSV download of the LT database, I can query it by ISBN and obtain the LCC. So, using a cheap barcode scanner, I can just go down a line of books, scanning ISBNs, building up commands to print labels, and go. Very efficient. And by combining multiple books in one print, I avoid wasting so much space due to the hardware requirements of extra space at the start and end of the labels.

maresb commented 3 months ago

That's magnificent!!!

Does your device have a cutter like my LabelManager PnP? We add horizontal margins to account for the gap between the print head and the cutter, and that's effectively the minimum margin for single label printing.

I briefly looked into starting and stopping the print head for making precise cuts. Unfortunately I was getting a blank stripe where the print head paused. I don't think it's due to a quirk in the software but things are so messy that it's difficult to rule out completely.

maresb commented 3 months ago

> A format that can't have an inconsistent state is better than a format that can have an inconsistent state.

Ya, I understand the theory behind the design principle. But in practice it only provides benefit if you are true to the underlying representation.

Fundamentally you are escaping newlines. Introducing new commands that effectively escape newlines doesn't change this. My concern is that a bunch of complexity is being pushed into the structure of the commands (either start/add, or flush). That's why I suggested newline, to be explicit that it's an escape character. I personally find both add and flush confusing and unintuitive because they seem to me like unnecessary control commands.

(Also, in terms of inconsistent states, what should happen when flush comes first? You could ignore it, but it still seems to me like something inconsistent.)

Could you articulate more what you find troublesome about NEWLINE as a literal escape sequence for a newline? Would it look better if the content continued on the same line as the newline?

I hope you don't think I'm just being difficult for the sake of it. I really like the idea and want to structure Labelle around it, so I want to be sure that we get it right. Thanks so much for all the feedback, please keep it coming.

jgoerzen commented 3 months ago

So let me begin by saying: I'm a very practical guy. It is more important to me to have this feature (being able to drive Labelle from a shell script) than the particular shape it takes. So if you as maintainers are saying, "we'll merge it only if you do it this other way", well then, I'll just do it the other way because it will soon be faster to just do that than to keep discussing it.

So to introduce just a bare NEWLINE keyword, we must keep track of:

The current data accumulated in the buffer (prior line of text/QR)
What type of request the prior line was
Whether the prior command was NEWLINE
Then verify that the command after the NEWLINE matches the type of request we were processing, and append it to the buffer rather than immediately add it to the image

So the internal processing is more cumbersome.
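
To make that bookkeeping concrete, here is a minimal sketch of a parser for the bare NEWLINE variant, assuming only TEXT and QR commands and a (type, content) pair per block. The name `parse_bare_newline` is hypothetical, and rejecting two NEWLINEs in a row or a trailing NEWLINE is one possible choice, not something settled in this discussion.

```python
def parse_bare_newline(spec: str) -> list[tuple[str, str]]:
    """Parse TEXT:/QR: lines where a bare NEWLINE joins the next line to the previous block.

    Hypothetical helper for illustration; not code from the PR.
    """
    blocks: list[tuple[str, str]] = []   # the last entry doubles as the buffer
    pending_newline = False              # was the previous command a NEWLINE?

    for raw in spec.splitlines():
        if raw == "NEWLINE":
            if not blocks or pending_newline:
                raise ValueError("NEWLINE must directly follow a TEXT or QR line")
            pending_newline = True
            continue
        command, sep, content = raw.partition(":")
        if not sep or command not in ("TEXT", "QR"):
            raise ValueError(f"invalid command line: {raw!r}")
        if pending_newline:
            prev_kind, prev_content = blocks[-1]
            if command != prev_kind:
                # The element type changed across a NEWLINE.
                raise ValueError(f"{command} cannot continue a {prev_kind} block")
            blocks[-1] = (prev_kind, prev_content + "\n" + content)
            pending_newline = False
        else:
            blocks.append((command, content))

    if pending_newline:
        raise ValueError("dangling NEWLINE at end of input")
    return blocks
```

The three pieces of state listed above show up as the last entry of `blocks` (buffer and prior type) and the `pending_newline` flag.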

Also in the case of someone needing to have a 3-line label with an empty line in the middle, it complicates things further, as two NEWLINEs in a row seem complicated. With the FLUSH idea, one could just:

TEXT:Line 1
TEXT:
TEXT:Line 3

Here one would have to:

TEXT:Line 1
NEWLINE
TEXT:
NEWLINE
TEXT:Line 3

Perhaps using CONTINUE rather than NEWLINE would make the sense of it more clear to the user (and to me). Effectively it is the opposite of FLUSH.

So with your idea I could see that example label being like this:

TEXT:FD12
CONTINUE
TEXT:2013
QR:1234
TEXT:BIG
TEXT:LINE1
CONTINUE
TEXT:LINE2
QR:12345

I suppose it would not have to be an error to have a QR follow a CONTINUE after a TEXT; the CONTINUE would simply be a no-op in that context. (Maybe print a warning; liberal in what you accept and all that)

I still think the other options (TEXTADD or FLUSH) are clearer for the user and simpler to implement in the program, but if you'd still prefer it this way (NEWLINE or CONTINUE), I'll send a new commit that changes it to work that way and that'll still meet my goals. Thanks for your openness to this!

maresb commented 3 months ago

Sorry for the delay, I'm on vacation so my availability is a bit spotty now.

I don't think I explained it adequately so I'll rewrite your examples with my latest NEWLINE proposal.

TEXT:Line 1
NEWLINE:
NEWLINE:Line 3

TEXT:FD12
NEWLINE:2013
QR:1234
TEXT:BIG
TEXT:LINE1
NEWLINE:LINE2
QR:12345

(assuming I didn't make a mistake, since I'm writing this out quickly.)

One advantage of this approach is that the number of commands (apart from NEWLINE) corresponds to the number of label elements. Also there is one command per label element.

In terms of implementation simplicity you only need a pointer to the previous content string, so in Python you could do something like:

# spec holds the whole label specification as a single string.
commands: list[str] = []
contents: list[str] = []
for line in spec.splitlines():
    command, content = line.split(":", 1)
    if command == "NEWLINE":
        # Continue the previous element on a new line.
        contents[-1] += "\n" + content
    else:
        commands.append(command)
        contents.append(content)
    assert len(commands) == len(contents)

Does this make sense? How complicated does it seem to you?

jgoerzen commented 3 months ago

Oh! Actually that is very similar to my original proposal, then. Instead of TEXTADD and QRADD, you have just a single keyword, NEWLINE. Sure, that would work fine. I would suggest naming it ADD instead of NEWLINE for clarity, but if that sounds good to you, I will modify this PR accordingly.

maresb commented 3 months ago

Could you explain in what way you find ADD clearer? To me it's very much the opposite. "Add" is a verb that takes two subjects: one thing gets added to another thing. Thus to disambiguate, wouldn't we need to specify ADD-THIS-LINE-TO-THE-PREVIOUS-LINE-WITH-A-NEWLINE-BETWEEN?

On the other hand, I don't see how NEWLINE might be misinterpreted, since "\nNEWLINE:" is literally the escape sequence for a newline in my proposed scheme.

In order to make the format extensible let's add a version line up top:

LABELLE-LABEL-SPEC-VERSION:1

Then I think we're ready to go.
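
Putting the pieces together, a minimal sketch of a parser for the format as proposed in this thread: a version line, then TEXT:, QR:, and NEWLINE: commands. The function name `parse_spec`, the error handling, and the (type, content) representation are assumptions for illustration; this is not the implementation merged into Labelle.

```python
VERSION_LINE = "LABELLE-LABEL-SPEC-VERSION:1"


def parse_spec(spec: str) -> list[tuple[str, str]]:
    """Parse a version-1 label spec into (element type, content) pairs.

    Hypothetical helper: only TEXT and QR elements are assumed to exist.
    """
    lines = spec.splitlines()
    if not lines or lines[0] != VERSION_LINE:
        raise ValueError(f"first line must be {VERSION_LINE!r}")

    elements: list[tuple[str, str]] = []
    for line in lines[1:]:
        command, sep, content = line.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in line: {line!r}")
        if command == "NEWLINE":
            if not elements:
                raise ValueError("NEWLINE cannot be the first command")
            # NEWLINE: appends to the previous element, whatever its type.
            kind, previous = elements[-1]
            elements[-1] = (kind, previous + "\n" + content)
        elif command in ("TEXT", "QR"):
            elements.append((command, content))
        else:
            raise ValueError(f"unknown command: {command!r}")
    return elements
```

Each non-NEWLINE command produces exactly one element, so invalid input can be rejected before anything is printed.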

jgoerzen commented 3 months ago

OK, I've pushed a new commit that adapts to the file format you specified. Thanks!

maresb commented 1 month ago

Sorry about the delay, this looks good to me.

labelle-org / labelle

Add new --batch option to read batch commands from stdin #72