.HUS format is mangling data

software-opal commented 6 years ago

G'day(and Happy New Year). I've been working with this program for a while trying to make a little pattern and noticed that for some formats my pattern is ... a bit mangled. I've uploaded a sample file and some scripts to try this stuff out in my fork(https://github.com/leesdolphin/Embroidermodder/blob/master/convert-roundabout.py#L109)

Basically converting a really simple 1cm box from CSV to HUS to CSV triples the height, doubles the width and all around messes with the way it should be.

Some formats are more susceptible than others, for example TAP handles the conversion fine; however PES decides that once it's done with a box-ish it'll drift off toward (40, 400).

The script expects to be placed in the root of the repo; and will do a build(qmake followed by make for libembroidery and libembroidery-convert) before running libembroidery-convert on the input file for each supported format. It'll then create a TXT file from the converted format and compare it to the original's TXT file by running diff in side-by-side mode.

For Example:

CSV -> CSV(adds a flag stitch for every conversion):

9                                                               |  10
                                                                >  0,0 color:0 flags:1
0,0 color:0 flags:1                                                0,0 color:0 flags:1
<snip>

CSV -> HUS(produces weird stitches with no apparent relation to the inputs)

9                                                               |  11
0,0 color:0 flags:1                                                0,0 color:0 flags:1
0,0 color:0 flags:1                                             |  -10.4,8.8 color:0 flags:0
0,0 color:0 flags:1                                             |  -21.4,-2.3 color:0 flags:0
0,0 color:0 flags:0                                             |  -25.6,-6.5 color:0 flags:0
0,10 color:0 flags:0                                            |  -28.2,-9.1 color:0 flags:0
10,10 color:0 flags:0                                           |  -35.8,-16.7 color:0 flags:0
10,0 color:0 flags:0                                            |  -23.1,-4 color:0 flags:0
0,0 color:0 flags:0                                             |  -23.1,-4 color:0 flags:0
0,0 color:0 flags:16                                            |  -23.1,-4 color:0 flags:0
                                                                >  -33.5,4.8 color:0 flags:0
                                                                >  -33.5,4.8 color:0 flags:16

JoshVarga commented 6 years ago

The height and width issue within HUS sounds like a real problem but most likely straightforward fix.

For the other issues mentioned could you also include the .CSV file that you used?

As for the weird stitches, I will have to check. There are a few things that happen internally which may cause the stitches for legitimate reasons. For example the maximum stitch lengths are different per format which would require adding jump stitches.

software-opal commented 6 years ago

The CSV file is this one here: https://github.com/leesdolphin/Embroidermodder/blob/master/line.csv

I'm quite happy to poke around with the code with a little bit of guidance as to what I should be poking.

tatarize commented 4 years ago

Well there is clearly a bug with husDecode. Initially the inputBuffer is always points to null and that will force the mStatus variable to be -1 rather than 0. Consequently this means that the routine husExpand_249 which is properly in deobfuscated form decode_c will first return 0 erroneously. This will return as the character being read and append to the front. Then be processed to add a 0 to the front of the ouput. And inputBuffer is never freed, so it won't be null next time. It will therefore let mStatus return 0 which will then skip appending a pointless 0 to the front.

The error compounds a bit as the whole routine is:

        if((_203 = (short) husExpand_249()) <= byte_MAX)
        {
            _278[_200] = (unsigned char) _203;
            if(++_200 >= _279)
            {
                _200 = 0;
                memcpy(&outputArray[outputPosition], _278, _279);
                outputPosition += _279;
            }
        }

deobfuscated:

            if ((c = decode_c()) <= UCHAR_MAX) {
                dec_text[r] = (unsigned char) c;
                if (++r >= DICSIZ) {
                    r = 0;
                    memcpy(&outputArray[outputPosition], dec_text, static_cast<size_t>(DICSIZ));
                    outputPosition += DICSIZ;
                }
            }

The extra returned char from decode_c() ticks up the r value which is constantly compared to DICSIZ the dictionary size. This r is always equal to r+1 and when the value gets above DICSIZ (1024, (_175 = (short) (1 << _269);, where _269 is equal to 10). I've seen a case where this poisoned the data with the wrong 0 value in position 0, when it was properly a 26. Though I was changing the ordering of the cmds, xs, and ys, and the bug should only really crop up for the commands which are generally lots of 80s and some other stuff.

But, this means it always reads an extra 0 for the commands at the front, which gets called a stitch, which is exactly the proscribed behaviors here. It could also manage to eat jumps or whatever since all the commands will be off by 1.

adds a flag stitch for every conversion

Embroidermodder / libembroidery

.HUS format is mangling data #124