receipt-print-hq / escpos-tools

Utilities to read ESC/POS print data
MIT License

esc2html encoding #61

Open jkalousek opened 5 years ago

jkalousek commented 5 years ago

I'm trying to convert my receipt to HTML so I can show a rough version of it before/after printing. I'm using basically the stock code from the example, but I have problems with diacritics (řčúěžý...) and some symbols such as ×. For example, instead of:

      $printer->text("  Výrobky s alkoholem na tomto dokladu");
      $printer->feed();
      $printer->text("     nejsou určeny k dalšímu prodeji,");
      $printer->feed();
      $printer->text("    ale výhradně ke konečné spotřebě.");

I get

  V∞robky s alkoholem na tomto dokladu
     nejsou urƒeny k dalτímu prodeji,
    ale v∞hradn╪ ke koneƒné spot²eb╪.

Text is printing fine on my printer. I hope that I'm using dummy connector correctly:

  $connector = new Mike42\Escpos\PrintConnectors\DummyPrintConnector();
  $profile = Mike42\Escpos\CapabilityProfile::load("default");
  $printer = new Mike42\Escpos\Printer($connector, $profile);

I tried adding iconv('CP437', 'UTF-8... to:

  if ($cmd -> isAvailableAs('TextContainer')) {
      // Add text to line
      // TODO could decode text properly from legacy code page to UTF-8 here.
      $spanContentText = $cmd -> getText();
      $lineHtml .= span($formatting, $spanContentText);
  }

But that just produced differently garbled text instead of the special symbols. Maybe I overlooked something in the documentation. Do you have any suggestions?

mike42 commented 4 years ago

Apologies for the slow response on this.

Your usage is correct, but non-default character encodings aren't effectively supported by these tools at the moment. The default encoding (0) is CP437, and this works fine. There would be ESC t commands in your receipt which switch to different code pages at different points of the text, and we need to look them up and figure out how to convert each character back into UTF-8. A complicating factor is that the mapping of code page numbers to encodings depends on which type of printer the receipt was created for.

The CP437 to UTF-8 conversion is performed in TextCmd.php, which would be removed once this is implemented.

I've marked this as an enhancement, because we don't seem to be tracking this yet.
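As a rough illustration of the conversion described above, here is a minimal sketch. It assumes the tools decoded the receipt bytes as CP437 while the receipt actually used another code page. The garbled characters in the report look consistent with CP852 bytes read as CP437, but that is inferred from the character pairs and not confirmed in this thread. The helper name redecodeFromCp437 is hypothetical and not part of escpos-tools, and whether iconv knows CP437 and CP852 depends on the iconv implementation PHP was built against.

```php
<?php
// Hypothetical helper, not part of escpos-tools: undo a wrong CP437 decode.
function redecodeFromCp437(string $garbled, string $actualCodePage): string
{
    // Step 1: recover the raw receipt bytes by re-encoding the text as CP437.
    $bytes = iconv('UTF-8', 'CP437', $garbled);
    if ($bytes === false) {
        throw new RuntimeException('Input contains characters outside CP437');
    }
    // Step 2: decode those bytes with the code page the receipt really used.
    $text = iconv($actualCodePage, 'UTF-8', $bytes);
    if ($text === false) {
        throw new RuntimeException("Cannot decode bytes as $actualCodePage");
    }
    return $text;
}

// Assumption: this line of the receipt was sent in CP852.
echo redecodeFromCp437('V∞robky s alkoholem na tomto dokladu', 'CP852'), PHP_EOL;
```

This only works when a single code page applies to the whole string. A receipt that switches code pages with ESC t would need each text run decoded with the code page active at that point, and the code page number would have to be resolved per printer type.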

jkalousek commented 4 years ago

My workaround was simply:

$search = explode(",","∞,ª,²,¼,Θ,º,╪,ƒ,₧,σ,à,τ,ñ");
$replace = explode(",","ý,Ž,ř,Č,Ú,ž,ě,č,×,ň,ů,š,€");

and

  if ($cmd -> isAvailableAs('TextContainer')) {
      // Add text to line
      // TODO could decode text properly from legacy code page to UTF-8 here.
      $spanContentText = $cmd -> getText();
      $spanContentText = str_replace($search, $replace, $spanContentText);
      $lineHtml .= span($formatting, $spanContentText);
  }

I only remapped the letters and symbols that I have used so far. Right now this is the easiest way for me to "fix" the issue.
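The same remapping can also be written as a single lookup table, which keeps each garbled/intended pair side by side. This is only a sketch of the workaround above: the pairs are copied from the two lists in this comment, the helper name fixGarbledText is hypothetical, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Pairs copied from the workaround: garbled character => intended character.
$fixups = [
    '∞' => 'ý', 'ª' => 'Ž', '²' => 'ř', '¼' => 'Č', 'Θ' => 'Ú',
    'º' => 'ž', '╪' => 'ě', 'ƒ' => 'č', '₧' => '×', 'σ' => 'ň',
    'à' => 'ů', 'τ' => 'š', 'ñ' => '€',
];

// Hypothetical helper: strtr() applies all pairs in one pass, so an
// already replaced character is never replaced a second time.
function fixGarbledText(string $text, array $fixups): string
{
    return strtr($text, $fixups);
}

echo fixGarbledText('nejsou urƒeny k dalτímu prodeji,', $fixups), PHP_EOL;
```

Like the original workaround, this covers only the listed characters and will also rewrite any of them that legitimately appear in a receipt.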

PratikBodawala commented 4 years ago

I am facing a similar issue: I need the ½, ¼, and ¾ characters, which I found in the latin1 (ISO-8859-1) charset.

My source file is in latin1 encoding, and the default is CP437.

I am working on this and will send you a PR.