Closed redjoe closed 2 years ago
MF3228 reportedly uses CARPS, not CAPT.
The carps-cups driver would likely be the place to start. The carps.txt file documents the compression format on known devices; hopefully the MF3228 won't be too different.
UPDATE: I just realised there is an existing issue on the carps-cups repo that concerns MF3228 support. I'll link it so that others can find the really useful info you posted here: https://github.com/ondrej-zary/carps-cups/issues/15
I'm afraid I won't be able to help you much at this point, as I am not yet familiar with some of the techniques used in the compression. This issue has been closed as support for this device is beyond the scope of captdriver.
Maybe I understand what's going on with the white pages. I'm just going to assume that the uncompressed image is encoded like Netpbm P4, based on the assumption that the MF3228 only does mono printing: one bit per pixel, eight pixels per byte.
The white pages appear to be RLE-compressed:
Encoded as P4: 295B * 3384 == 998280B
Assuming that 00 is some indicator for RLE mode and 08 80 (2176) is a repeat count, the compressed version is encoded in 423B * 2176 == 920448B, which is somewhat close. There might be something else going on...
The 3384 line count might be the result of rounding to the next lower multiple of 8 or 4.
The data size is twice as large as the compressed A4 300dpi white page :grin:
A5 is 5.8x8.3 in == 1470x2490px raw, 1350x2370 with 120px crop, 1344x2370 with rounding to byte size == 168B * 2370 == 398160B. 295B * 2176 == 641920B, close to 320960B * 2 (did you mean A5 600 dpi?)
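The size arithmetic behind the RLE hypothesis can be reproduced with a short sketch. It assumes the 2360x3384 px page size and the compressed lengths (426, 846 and 298 bytes) quoted elsewhere in this thread, and it treats `00 08 80` as a 3-byte trailer purely as a working assumption; the helper name `packed_size` is made up for illustration.

```python
def packed_size(width_px, height_px):
    """Size of a 1-bit-per-pixel bitmap (Netpbm P4 style), rows padded to whole bytes."""
    bytes_per_row = (width_px + 7) // 8
    return bytes_per_row, bytes_per_row * height_px


# Raw size of the A4 300dpi page reported in this thread (2360x3384 px).
row_bytes, raw_bytes = packed_size(2360, 3384)
print(row_bytes, raw_bytes)  # 295 998280

# Hypothesis under test: every ff byte stands for 0x0880 == 2176 raw bytes.
REPEAT = 0x0880
TRAILER_LEN = 3  # assumed: the trailing 00 08 80 is not image data

for label, compressed_len in [("A4 300dpi", 426), ("A4 600dpi", 846), ("A5 300dpi", 298)]:
    ff_bytes = compressed_len - TRAILER_LEN
    print(label, ff_bytes, ff_bytes * REPEAT)
```

The A4 300dpi estimate (920448 bytes) is about 8% below the raw size, which is why the hypothesis only looks "somewhat close".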
Black pages seem to have very different starting bytes for different image sizes. Both black and white pixels are referenced with \xff. Could there be some kind of dictionary in use? Could it be LZ77 (which can look like RLE in some cases)?
The black stripe pages seem to suggest that some kind of dictionary encoding is in use, which LZ family encoders are based on.
Earlier this year, I wrote a Python script, sample_blots.py, that generates a bunch of patterns for studying RLE compression. I hope it helps here too...
Just be careful to avoid accidentally overwriting files with the script; its overwrite detection is a little lacking :warning:
Data compression by CCITT Group 4.
I repeated your black page and white page experiments with a hand-coded SVG and an rsvg-convert/GhostScript pipeline, and got similar but different results:
Black Page SVG :black_circle:
<?xml version='1.0' encoding='UTF-8' standalone='no' ?>
<svg width='210mm' height='297mm' xmlns='http://www.w3.org/2000/svg' xmlns:xlink='http://www.w3.org/1999/xlink'>
<desc>Just a blank, black A4 page</desc>
<rect x='0' y='0' width='210mm' height='297mm' stroke='black' fill='black' />
</svg>
White Page SVG :white_circle:
<?xml version='1.0' encoding='UTF-8' standalone='no' ?>
<svg width='210mm' height='297mm' xmlns='http://www.w3.org/2000/svg' xmlns:xlink='http://www.w3.org/1999/xlink'>
<desc>Just a blank, white A4 page</desc>
<rect x='0' y='0' width='210mm' height='297mm' stroke='white' fill='white' />
</svg>
My pipeline:
rsvg-convert -f pdf -o $PDF_FILE $SVG_FILE
gs -dSAFER -dNOPAUSE -dNOPROMPT -r600 -SDEVICE=faxg4 -o $IMAGE_FILE $PDF_FILE
Try rsvg-convert -x 0.801 -y 0.801 if the resulting PDF has a larger page size than expected (version 2.40.2 needs this fix).
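For repeated experiments, the two commands can be wrapped in a small script. This is only a sketch: the function names are invented here, the flags are copied from the pipeline above (with the resolution written as `-r600`), and nothing is executed unless `run_pipeline` is called on a machine that has both tools installed.

```python
import shutil
import subprocess


def build_commands(svg_file, pdf_file, image_file, dpi=600):
    """Return the argv lists for the SVG -> PDF -> Group 4 fax pipeline."""
    rsvg_cmd = ["rsvg-convert", "-f", "pdf", "-o", pdf_file, svg_file]
    gs_cmd = [
        "gs", "-dSAFER", "-dNOPAUSE", "-dNOPROMPT",
        f"-r{dpi}", "-SDEVICE=faxg4",
        "-o", image_file, pdf_file,
    ]
    return rsvg_cmd, gs_cmd


def run_pipeline(svg_file, pdf_file, image_file, dpi=600):
    """Run both steps, failing early if a tool is missing or returns an error."""
    for cmd in build_commands(svg_file, pdf_file, image_file, dpi):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)


# Example (hypothetical file names):
# run_pipeline("black_a4.svg", "black_a4.pdf", "black_a4.g4")
```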
Results:
A4 600dpi Black Page: 26 a0 3e 03 81 af ff ... ff fe 0a (1762 bytes)
A4 600dpi White Page: ff ... ff 80 0a (879 bytes)
I repeated the steps, adding the page-size option -g4720x6768:
gs -dSAFER -dNOPAUSE -dNOPROMPT -r600 -SDEVICE=faxg4 -g4720x6768 -o $IMAGE_FILE $PDF_FILE
For the 600dpi black page I got 26 a0 3e 02 80 c9 ff ... ff f8.
Compare this with the Canon driver's output, 64 05 7c 40 01 93 ff ...: the difference is the bit numbering within each byte.
26 a0 3e 02 80 c9 -> 00100110 10100000 00111110 00000010 10000000 11001001
64 05 7c 40 01 93 -> 01100100 00000101 01111100 01000000 00000001 10010011
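The two rows above are bit-for-bit mirror images of each other, byte by byte. A minimal sketch to check this (the table name `REVERSE` is just an illustrative choice):

```python
def reverse_bits(byte):
    """Mirror the 8 bits of one byte, e.g. 0x26 (00100110) -> 0x64 (01100100)."""
    return int(f"{byte:08b}"[::-1], 2)


# 256-entry lookup table, usable with bytes.translate().
REVERSE = bytes(reverse_bits(b) for b in range(256))

gs_bytes = bytes.fromhex("26 a0 3e 02 80 c9")      # GhostScript faxg4 output
driver_bytes = bytes.fromhex("64 05 7c 40 01 93")  # Canon driver output

print(gs_bytes.translate(REVERSE).hex(" "))  # 64 05 7c 40 01 93
```

A lookup table keeps the conversion cheap if a whole page needs to be converted rather than six bytes.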
Let's decompose the GhostScript result 26 a0 3e 02 80 c9, reading it in MSB2LSB bit order:
|Horizontal Mode Coding
|--
|  |a0a1, distance = 0 (White codes)
|  |--------
|  |        |a1a2, coding length 2560 (Black codes)
|  |        |------------
|  |        |            |a1a2, coding length 2112 (Black codes)
|  |        |            |-------------
|  |        |            |             |a1a2, coding length 48 (Black codes)
|  |        |            |             |------------
00100110 10100000 00111110 00000010 10000000 11001001
2560 + 2112 + 48 = 4720px. The code words can be looked up via https://www.itu.int/rec/T-REC-T.6-198811-I/en or in libtiff/t4.h.
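The decomposition can be replayed mechanically. The sketch below is not a Group 4 decoder: it carries only the handful of code words that occur in these six bytes (transcribed from the T.4 tables, so worth double-checking against libtiff/t4.h) and reads just the first horizontal-mode element. The function names and the `lsb_first` switch are illustrative.

```python
# Partial code tables: only the code words that occur in the bytes discussed above.
WHITE_CODES = {"00110101": 0}  # terminating code, white run of 0
BLACK_CODES = {
    "000000011111": 2560,      # make-up codes (run >= 64)
    "000000010100": 2112,
    "000001100100": 48,        # terminating code (run < 64)
}
HORIZONTAL_MODE = "001"


def to_bits(data, lsb_first=False):
    """Bytes -> bit string, optionally reading each byte from its lowest bit."""
    bits = (f"{byte:08b}" for byte in data)
    return "".join(b[::-1] if lsb_first else b for b in bits)


def read_run(bits, pos, table):
    """Read make-up codes up to and including one terminating code."""
    total = 0
    while True:
        for code, run in table.items():
            if bits.startswith(code, pos):
                pos += len(code)
                total += run
                break
        else:
            raise ValueError(f"no known code word at bit {pos}")
        if run < 64:  # a terminating code ends the run
            return total, pos


def first_horizontal_runs(data, lsb_first=False):
    """Return (white_run, black_run) of the leading horizontal-mode element."""
    bits = to_bits(data, lsb_first)
    if not bits.startswith(HORIZONTAL_MODE):
        raise ValueError("data does not start with a horizontal-mode code")
    white, pos = read_run(bits, len(HORIZONTAL_MODE), WHITE_CODES)
    black, pos = read_run(bits, pos, BLACK_CODES)
    return white, black


print(first_horizontal_runs(bytes.fromhex("26 a0 3e 02 80 c9")))        # (0, 4720)
print(first_horizontal_runs(bytes.fromhex("64 05 7c 40 01 93"), True))  # (0, 4720)
```

Read LSB-first, the Canon driver's bytes give the same 4720px run as the GhostScript output.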
For your example, the A4 600dpi black page 26 a0 3e 03 81 af, I get a width of 2560 + 2368 + 39 = 4967px.
|Horizontal Mode Coding
|--
|  |a0a1, distance = 0 (White codes)
|  |--------
|  |        |a1a2, coding length 2560 (Black codes)
|  |        |------------
|  |        |            |a1a2, coding length 2368 (Black codes)
|  |        |            |-------------
|  |        |            |             |a1a2, coding length 39 (Black codes)
|  |        |            |             |------------
00100110 10100000 00111110 00000011 10000001 10101111
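Going the other way, a run length splits into code-word values as follows: repeat the 2560 make-up code while the remainder is at least 2560, then at most one make-up code (a multiple of 64), then one terminating code (0 to 63). A small sketch of that rule, with a made-up function name:

```python
def split_run(run_length):
    """Split a run length into the values of its make-up and terminating codes."""
    parts = []
    while run_length >= 2560:
        parts.append(2560)          # largest make-up code, repeated as needed
        run_length -= 2560
    if run_length >= 64:
        makeup = run_length // 64 * 64
        parts.append(makeup)        # one make-up code, a multiple of 64
        run_length -= makeup
    parts.append(run_length)        # terminating code, always present (0..63)
    return parts


print(split_run(4720))  # [2560, 2112, 48]
print(split_run(4967))  # [2560, 2368, 39]
```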
The end of the file I get from GhostScript is different: I don't see an end-of-facsimile block (EOFB). The format of the EOFB is 0000 0000 0001 0000 0000 0001. Maybe this is because of how faxg4 is defined:
Group 4 fax, with EOLs but no header or EOD.
A4 600dpi Black Page: 26 a0 3e 03 81 af ff ... ff f0 (width 4967px)
A4 600dpi Black Page: 26 a0 3e 02 80 c9 ff ... ff f8 (width 4720px)
A4 600dpi Black Page: 64 05 7c 40 01 93 ff … ff 1f 00 01 10 (Canon driver, LSB2MSB bit order)
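Whether the driver output ends with an EOFB can be checked by reading its trailing bytes in LSB-first order and searching for the 24-bit pattern. The sketch below does this for two trailers quoted in this thread (`1f 00 01 10` from the black page and `00 08 80` from the white pages); the function names are illustrative, and the interpretation of the bits around the EOFB is left open.

```python
EOFB = "000000000001" * 2  # 0000 0000 0001 0000 0000 0001


def bits_lsb_first(data):
    """Bytes -> bit string, reading each byte from its lowest bit upward."""
    return "".join(f"{byte:08b}"[::-1] for byte in data)


def split_at_eofb(trailer_hex):
    """Return (bits before EOFB, bits after EOFB), or None if there is no EOFB."""
    bits = bits_lsb_first(bytes.fromhex(trailer_hex))
    pos = bits.find(EOFB)
    if pos < 0:
        return None
    return bits[:pos], bits[pos + len(EOFB):]


print(split_at_eofb("1f 00 01 10"))  # ('11111', '000')
print(split_at_eofb("00 08 80"))     # ('', '')
```

In both cases the EOFB is present once the bytes are read LSB-first, followed only by zero padding.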
Just in case it matters, I was using GhostScript 9.26 from late 2018, but I doubt that makes much of a difference, unlike JPEG or other lossy compression codecs.
Looks like using the GS encoder as-is won't work, but it looks to me that the changes required won't be too difficult to implement. Or maybe I might have missed some option that enables LSB-first mode? (it would make things so easy if there was such a thing!)
I found the parameter -dFillOrder=2: pixels are stored starting from the lower-order bits of the byte.
Please help me understand the encoded data of the MF3228 printer.
White pages encoded:
A4 300dpi, length 426 bytes: ff ff ff … 00 08 80
A4 600dpi, length 846 bytes: ff ff ff … 00 08 80
A5 300dpi, length 298 bytes: ff ff ff … 00 08 80
Black fill pages:
A4 300dpi, length 854 bytes: 64 05 74 a0 f8 ff … ff 01 10 00 01
A4 600dpi, length 1701 bytes: 64 05 7c 40 01 93 ff … ff 1f 00 01 10
A5 300dpi, length 598 bytes: 64 05 da 60 f5 ff … ff 01 10 00 01
Left vertical black line. The origin is the top left corner. All samples are A4, 300 dpi: width 2360px, height 3384px. The offset of the printable area relative to the left side (the origin) is 5.165mm. The pixel width is approximately 0.085mm. I printed the samples with Inkscape.
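The geometry figures above are consistent with each other at 300 dpi, as a quick conversion shows. This is a plain unit-conversion sketch; the helper names are made up.

```python
MM_PER_INCH = 25.4


def px_to_mm(px, dpi):
    """Length in millimetres of `px` pixels at the given resolution."""
    return px * MM_PER_INCH / dpi


def mm_to_px(mm, dpi):
    """Number of pixels (possibly fractional) covering `mm` millimetres."""
    return mm * dpi / MM_PER_INCH


print(round(px_to_mm(1, 300), 3))     # 0.085 (width of one pixel in mm)
print(round(mm_to_px(5.165, 300)))    # 61    (left offset in pixels)
print(round(px_to_mm(2360, 300), 1))  # 199.8 (printable width in mm)
print(round(px_to_mm(3384, 300), 1))  # 286.5 (printable height in mm)
```

The 5.165mm offset is almost exactly 61 pixels, and 2360px corresponds to roughly the A4 width of 210mm minus that offset on both sides.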
Example data for width 5.165mm, 1px:
64 d5 ff … ff 0f 80 00 08 (data length 1274 bytes)

| Leading bytes | Leading bytes (binary) | Trailing bytes |
|---|---|---|
| 64 d5 | 01100100 11010101 | 0f 80 00 08 |
| 64 fd | 01100100 11111101 | 07 40 00 04 |
| 64 ed | 01100100 11101101 | 07 40 00 04 |
| 64 f5 | 01100100 11110101 | 0f 80 00 08 |
| 64 e5 | 01100100 11100101 | 1f 00 01 10 |
| 64 a5 | 01100100 10100101 | 1f 00 01 10 |
| 64 c5 | 01100100 11000101 | 3f 00 02 20 |
| 64 45 | 01100100 01000101 | 7f 00 04 40 |
| 64 45 ef | 01100100 01000101 11101111 | 7f 00 04 40 |
| 64 85 fc | 01100100 10000101 11111100 | 00 08 80 |
| 64 85 fe | 01100100 10000101 11111110 | 00 08 80 |
| 64 85 ff | 01100100 10000101 11111111 | 00 08 80 |
| 64 05 66 b0 fe | 01100100 00000101 01100110 10110000 11111110 | 01 10 00 01 |
| 64 05 16 20 f9 | 01100100 00000101 00010110 00100000 11111001 | 01 10 00 01 |
| 64 05 36 60 f2 | 01100100 00000101 00110110 01100000 11110010 | 02 20 00 02 |
| 64 05 6e 30 f3 | 01100100 00000101 01101110 00110000 11110011 | 03 20 00 02 |
| 64 05 04 fa | 01100100 00000101 00000100 11111010 | 03 20 00 02 |