Embroidermodder / libembroidery

Library for reading/writing/manipulating machine and design embroidery files
https://www.libembroidery.org
zlib License

Format Encoding and Needed Work/Changes for 1.0 #180

Open tatarize opened 2 years ago

tatarize commented 2 years ago

I've written other embroidery input and output libraries for Python and Java. I've considered this rather strongly over the years, ran into various problems with libembroidery itself, and corrected them after considerable thought.


Encoding

There is a need for a centralized set of methods that slightly abstracts the embroidery between reading and writing (e.g. https://github.com/EmbroidePy/pyembroidery/blob/master/pyembroidery/EmbEncoder.py). Without this, the formats get hit with different commands in different orders, and some of those commands are accepted by one format but not by another.

Sometimes the expectation is that you perform a stop, trim, then a series of jumps, other times the format may expect you to trim, jump, then stop for the color change. You can load from a format that has needle-set commands rather than color-change commands, and you need to know that the first needle-set there isn't a color-change. It's setting the first needle, not performing the first change of the thread.

Nuances

There is a lot of nuance in formats. The only way I found to capture these nuances and consistently convert from one format to another is through a neutral intermediate format. This also allows you to add some higher-level commands without needing each writer to deal with them on a case-by-case basis.

E.g. if you want to use a TIE_OFF_THREE_STITCH to tie off a stitch block, you can easily add it to your list of stitches without touching anything in the writers, because it will always go through the encoder, which turns it into the required sequence.
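A minimal sketch of that expansion step in Python. The command names, the `(command, x, y)` tuple layout, and the 0.4 mm lock-stitch length are all assumptions for illustration, not libembroidery's or pyembroidery's actual API, and real tie-off patterns vary by machine.

```python
# Hypothetical command names for illustration only.
STITCH = "STITCH"
TIE_OFF = "TIE_OFF_THREE_STITCH"


def expand_tie_offs(commands, step=0.4):
    """Replace each TIE_OFF marker with three small lock stitches.

    commands: list of (command, x, y) tuples with absolute coordinates in mm.
    step: assumed length of the lock stitches in mm.
    """
    out = []
    for cmd, x, y in commands:
        if cmd != TIE_OFF:
            out.append((cmd, x, y))
            continue
        # Stitch in place, step away, then return to the tie-off point.
        out.append((STITCH, x, y))
        out.append((STITCH, x + step, y))
        out.append((STITCH, x, y))
    return out
```

The writers only ever see plain stitches, so none of them needs to know that a tie-off command exists.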

Writer Expectation

The writers for an embroidery format need to specify what types of commands they expect and in what order. If sequins are in the incoming data and the writer cannot handle them, they can be stripped. If the data has speed commands (slow and fast, as seen in Barudan .Uxy), these can be removed before the data is sent to a format that cannot write speed commands. The writers would also dictate limits: DST has a max jump length of 12.1mm, whereas another format might have no max jump length.

All of these nuances are a lot easier to handle from a centralized location than by trying to have every writer account for every nuance that may arise depending on the source of the data.
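As a rough sketch, assuming each writer declares a set of supported commands and an optional maximum jump length, the centralized step could look like the following. The function name, the command names, and the tuple layout are hypothetical; only the 12.1mm DST limit comes from the text above.

```python
import math

# Hypothetical command names for illustration only.
STITCH, JUMP, SEQUIN, SLOW, FAST = "STITCH", "JUMP", "SEQUIN", "SLOW", "FAST"


def encode_for_writer(commands, supported, max_jump=None):
    """Filter and split commands so they match what a writer declared.

    commands: list of (command, x, y) tuples with absolute coordinates in mm.
    supported: set of command names the writer can emit.
    max_jump: longest jump the format allows in mm, or None for no limit.
    """
    out = []
    x0, y0 = 0.0, 0.0  # current needle position
    for cmd, x, y in commands:
        if cmd not in supported:
            continue  # the writer cannot express this command, so drop it
        if cmd == JUMP and max_jump is not None:
            dist = math.hypot(x - x0, y - y0)
            pieces = max(1, math.ceil(dist / max_jump))
            # Split one long jump into equal jumps within the limit.
            for i in range(1, pieces + 1):
                t = i / pieces
                out.append((JUMP, x0 + (x - x0) * t, y0 + (y - y0) * t))
        else:
            out.append((cmd, x, y))
        x0, y0 = x, y
    return out


# Usage: a DST-like writer with no speed or sequin support and a 12.1mm limit.
dst_like = encode_for_writer(
    [(STITCH, 0, 0), (SLOW, 0, 0), (JUMP, 30, 0)],
    supported={STITCH, JUMP},
    max_jump=12.1,
)
```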

Threadlist

There's a need to correct the threadlist information. If you load a .DST embroidery file with 5 colors but no color information (DST files lack overt color information unless they have extended headers), those threads need to be treated differently from thread entries that have actual, discrete colors based on known information.
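One way to keep that distinction, sketched in Python: a thread with no known color stays a placeholder until something explicitly fills it in. The `Thread` class and `resolve_threads` helper are hypothetical names for illustration, and the fallback palette is assumed to be non-empty.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Thread:
    color: Optional[int] = None  # 0xRRGGBB, or None when the file gave no color
    description: str = ""

    @property
    def is_placeholder(self) -> bool:
        return self.color is None


def resolve_threads(threads: List[Thread], fallback: List[int]) -> List[Thread]:
    """Fill placeholder threads from a fallback palette; keep known threads."""
    resolved = []
    for index, thread in enumerate(threads):
        if thread.is_placeholder:
            # Cycle through the fallback palette and mark the color as assumed.
            resolved.append(Thread(fallback[index % len(fallback)], "assumed"))
        else:
            resolved.append(thread)
    return resolved


# Usage: a 5-color DST with no color information.
dst_threads = resolve_threads([Thread() for _ in range(5)], [0xFF0000, 0x00FF00])
```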

Metadata

There's also metadata that differs between the formats. Label, author, copyright, etc. can sometimes be written in a particular format (e.g. PES v4+), and another format's reader may provide that information.

Consistency of using Encoding

You'll notice that some items like #170 (consistent end command) get fixed pretty easily with this, because you'd no longer be feeding raw stitches from one format into another. All the jumps and trims get called an end to the stitchblock, or an end to the colorblock if there's a color-change. Whatever the output format may be, we can have those elements converted correctly for that writer. The writers can be guaranteed consistent orderings and data types.
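For the end-command case specifically, the encoder could guarantee exactly one END no matter what the reader produced. This is only a sketch under assumed names; it treats any END found mid-stream as reader noise, which may not hold for every format.

```python
# Hypothetical command names; not libembroidery's actual identifiers.
STITCH, JUMP, TRIM, END = "STITCH", "JUMP", "TRIM", "END"


def normalize_end(commands):
    """Return a copy of commands that finishes with exactly one END.

    commands: list of (command, x, y) tuples with absolute coordinates.
    """
    # Drop every END the reader passed through, wherever it appeared.
    body = [c for c in commands if c[0] != END]
    # Place the single END at the last known position (origin if empty).
    _, x, y = body[-1] if body else (END, 0, 0)
    return body + [(END, x, y)]
```

With this in the encoder, no writer has to check whether the incoming data already has an end command.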

Matrix

Writing a Matrix is really powerful and lets you apply affine transformations in code (e.g. https://github.com/EmbroidePy/pyembroidery/blob/master/pyembroidery/EmbMatrix.py). These abilities are really important for things like composing patterns together (#174), where you need to know where the next pattern should be relative to the previous one, or for basic changes to an embroidery like slightly resizing it.
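A small 2D affine matrix is enough for this. The class below is an illustrative sketch, not the API of pyembroidery's EmbMatrix or of libembroidery; the method names are assumptions. It stores the six coefficients of x' = a*x + c*y + e, y' = b*x + d*y + f.

```python
import math


class Matrix:
    """2D affine transform: x' = a*x + c*y + e, y' = b*x + d*y + f."""

    def __init__(self, a=1.0, b=0.0, c=0.0, d=1.0, e=0.0, f=0.0):
        self.a, self.b, self.c, self.d, self.e, self.f = a, b, c, d, e, f

    @staticmethod
    def translate(tx, ty):
        return Matrix(1.0, 0.0, 0.0, 1.0, tx, ty)

    @staticmethod
    def scale(sx, sy=None):
        return Matrix(sx, 0.0, 0.0, sx if sy is None else sy, 0.0, 0.0)

    @staticmethod
    def rotate(degrees):
        r = math.radians(degrees)
        return Matrix(math.cos(r), math.sin(r), -math.sin(r), math.cos(r))

    def then(self, other):
        """Return the matrix that applies self first, then other."""
        return Matrix(
            other.a * self.a + other.c * self.b,
            other.b * self.a + other.d * self.b,
            other.a * self.c + other.c * self.d,
            other.b * self.c + other.d * self.d,
            other.a * self.e + other.c * self.f + other.e,
            other.b * self.e + other.d * self.f + other.f,
        )

    def apply(self, x, y):
        return (self.a * x + self.c * y + self.e,
                self.b * x + self.d * y + self.f)


# Usage: double the size of a pattern, then place it 10mm to the right.
m = Matrix.scale(2).then(Matrix.translate(10, 0))
placed = [m.apply(x, y) for x, y in [(0, 0), (1, 1)]]
```

Composing a second pattern next to the first then comes down to building one translate matrix from the first pattern's bounds and applying it to every point of the second.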

Commands

There are a few more commands that occur within embroidery than are accounted for within this library. These are seen in various embroidery formats, and libembroidery should include them to be effective.

NeedleSet

Some formats, specifically .U01 (.UXY) and some others, use a needle_set command rather than a color_change. This differs from a color-change in two distinct and important ways. First, it occurs before sewing begins rather than after sewing ends. Second, it includes the needle number explicitly. This needs to be accounted for, and conversion between a color-change (which is usually just STOP) and a needle_set needs to be permitted. That tends to require knowing how many needles the machine has, or taking an educated guess.
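Both directions of that conversion can be sketched as follows. The command names and the `(command, value)` tuple layout are assumptions for illustration, and cycling through needles in order is just one possible educated guess when the real needle assignment is unknown.

```python
# Hypothetical command names for illustration only.
STITCH, COLOR_CHANGE, NEEDLE_SET = "STITCH", "COLOR_CHANGE", "NEEDLE_SET"


def needle_sets_to_color_changes(commands):
    """Turn needle_set commands into color changes.

    The first needle_set only picks the starting thread, so it is dropped
    rather than converted. commands: list of (command, value) tuples.
    """
    out = []
    seen_first = False
    for cmd, value in commands:
        if cmd == NEEDLE_SET:
            if seen_first:
                out.append((COLOR_CHANGE, None))
            seen_first = True
        else:
            out.append((cmd, value))
    return out


def color_changes_to_needle_sets(commands, needle_count):
    """Turn color changes into needle_sets, cycling through the needles.

    needle_count: number of needles on the target machine (a guess if unknown).
    """
    needle = 1
    out = [(NEEDLE_SET, needle)]  # set the first needle before sewing begins
    for cmd, value in commands:
        if cmd == COLOR_CHANGE:
            needle = needle % needle_count + 1
            out.append((NEEDLE_SET, needle))
        else:
            out.append((cmd, value))
    return out
```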

STOP vs. COLOR_CHANGE

There are some formats that let you imply stop without a color-change. This is used when applying applique. We stop but we're not changing threads. Sometimes this even involves a FRAME_EJECT which is a trim and then moving to the top of the design to give the user unfettered access to the embroidery, to add the applique or whatever they are doing.
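If FRAME_EJECT is kept as a higher-level command, an encoder could expand it for writers that only know the primitive commands. The sketch below assumes hypothetical command names, absolute coordinates, and a y axis that grows downward, so "top of the design" means the smallest stitched y value.

```python
# Hypothetical command names for illustration only.
STITCH, JUMP, TRIM, STOP, FRAME_EJECT = "STITCH", "JUMP", "TRIM", "STOP", "FRAME_EJECT"


def expand_frame_eject(commands):
    """Replace FRAME_EJECT with a trim, a move to the top, and a stop.

    commands: list of (command, x, y) tuples with absolute coordinates.
    No color change is emitted: the machine stops with the same thread.
    """
    stitched_y = [y for cmd, _, y in commands if cmd == STITCH]
    top = min(stitched_y) if stitched_y else 0
    out = []
    for cmd, x, y in commands:
        if cmd != FRAME_EJECT:
            out.append((cmd, x, y))
            continue
        out.append((TRIM, x, y))    # cut the thread where sewing stopped
        out.append((JUMP, x, top))  # move the frame to the top of the design
        out.append((STOP, x, top))  # wait for the user, e.g. to place applique
    return out
```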

Versions

#139 is important. There need to be mechanisms for specifying the version of the file you intend to write. This covers things like PES v6, but also other ways of writing PES, like PES without the PES block.

The same goes for writing different versions of CSV files, for example one that is actually correct to the CSV specs vs. one that retains backwards compatibility. There are also very likely to be dozens of potentially supported gcode formats (#171).

Avoid Feature Creep.

A core library like this needs to just do the stuff it does in a well-defined way. It doesn't need to do fill algorithms or anything outside of the translation problems that can occur between different formats. If there is a need for fills or higher-level changes, that should be spun off into a different, more powerful library that does that work.


You will notice that none of that is that extreme. There are a lot of little problems with the readers and writers: small bugs and edge conditions.

The biggest issues are the cases where the current descriptive language will not capture the nuance of all formats, and the need to generalize the format between reading and writing.

Additional features like composing two different files together get a lot easier when this is done correctly. The bulk of the important work is understanding the formats themselves, which is mostly already done.

This is one part of the project that can actually be finished. After that, the only maintenance needed would be a few isolated format fixes for a reader or writer. Getting there requires getting rid of situations where all writers need to check whether they have an end command, where all writers need to deal with sequin stitches explicitly, or where a reader read something verbatim but in the wrong order for a given writer.

tatarize commented 2 years ago

Suffice it to say, I don't believe any of the fill or satin stuff, etc. belongs here. This should really focus on embroidery reading and writing, in a way that isn't infinitely complex or subject to feature creep.

You can really zero in on the main, very low-level reading and writing stuff and make it rock solid.

If there's a need to have the fill algorithm stuff divided from the gui stuff it might be best to put that in a different library that could simply depend on this one, though it would be a library of fill-like commands to points which doesn't even need to know what embroidery formats are.

robin-swift commented 2 years ago

There's a lot here that is useful so I feel it's better placed in the documentation. Can you make a pull request on the Encoding header to the end of the Avoid Feature Creep section as part of the docs/ folder? I'll need to refer to details like this and the printer-friendly docs are going to be my first point of reference. You can just make a new file if you don't want to get into editing the TeX.

Also I find large blocks of sans-serif, variable-width fonts particularly hard to read (I lose my place a lot, probably mild visual dyslexia), so a printer-friendly version of your longest comments where I can change the formatting would be very helpful to me.

tatarize commented 2 years ago

I'm not sure it belongs in documentation. But, I went over it and removed some of the unneeded words and phrases and broke up the text into more relevant shorter sections with more clearly demarcated grammar.