michal-h21 / make4ht

Build system for tex4ht

Special Characters Ubuntu and Windows -- Difference #122

Closed balakrishnan1978 closed 9 months ago

balakrishnan1978 commented 1 year ago

Dear Michal,

I am using special characters in LaTeX. They work fine on Ubuntu but not on Windows, where the special characters come out as junk in the HTML. I have not been able to identify the problem. Could you please check where it is?

Ubuntu TeX Live version: (TeX Live 2023) (preloaded format=latex 2023.5.14)
Windows TeX Live version: (TeX Live 2023) (preloaded format=latex 2023.5.14)

I have attached all the required files here.

To use them, remove the .txt extension and change the last _ to . (GitHub does not accept .tex or .html attachments). For example, rename test_ubuntu_html.txt to test_ubuntu.html.

test_ubuntu_log.txt test_ubuntu_html.txt test_ubuntu_tex.txt test_windows_log.txt test_windows_html.txt test_windows_tex.txt

ThiloteE commented 1 year ago

If all else fails, you could try the method of half splitting to debug your files:

https://ljackso.medium.com/half-splitting-applying-a-troubleshooting-technique-to-debugging-code-6a0578d1833c
https://www.ecmweb.com/maintenance-repair-operations/article/20889049/the-beauty-of-halfsplitting
https://www.techrepublic.com/article/secrets-of-a-super-geek-use-half-splitting-to-solve-difficult-problems/
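
As a rough illustration of half splitting applied to a source file, here is a minimal Python sketch. The function name `first_failing_line` and the `fails` callback are hypothetical; in practice `fails` would write the given lines to a temporary file, compile it, and inspect the result. The sketch assumes that an empty prefix passes, the full file fails, and that once a prefix fails every longer prefix fails too.

```python
def first_failing_line(lines, fails):
    """Binary-search for the first line whose inclusion makes `fails` true.

    Assumes fails(lines[:0]) is False and fails(lines) is True, and that
    failure is monotonic in the prefix length.
    """
    lo, hi = 0, len(lines)  # invariant: lines[:lo] passes, lines[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(lines[:mid]):
            hi = mid
        else:
            lo = mid
    return hi - 1  # index of the line that tips the prefix into failure


# Toy stand-in for "compile and check": any non-ASCII line counts as a failure.
doc = ["plain text", "more text", "caf\u00e9 with accent", "closing text"]
culprit = first_failing_line(doc, lambda part: any(not s.isascii() for s in part))
print(culprit, doc[culprit])
```

Each round halves the suspect region, so even a long file needs only a handful of test compilations.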

michal-h21 commented 1 year ago

I cannot test it on Windows, but I get a different result than you got on Ubuntu. Most notably, you have this HTML code:

<span 
class="cmmi-12">j < N </span>Laplacians and hypergraph Laplacians.

It should be this:

<span class="cmmi-12">j &lt; N </span>Laplacians and hypergraph Laplacians.
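
For reference, the expected escaping can be reproduced with Python's standard `html.escape`. This only illustrates what the correct output looks like; it says nothing about how tex4ht produces it internally.

```python
import html

# The text content of the span from the example above.
fragment = "j < N "

# html.escape replaces &, < and > (and quotes, by default) with entities.
escaped = html.escape(fragment)
print(escaped)
```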
balakrishnan1978 commented 1 year ago

Dear Michal, I pre-process the file to change < and > to \lt and \gt, and the output is generated correctly on Ubuntu. My main concern is that Windows does not generate the special accented characters correctly. Thanks and regards, Bala


michal-h21 commented 1 year ago

I've just tried the example on Windows with TL 2023, and it compiled correctly. So it must be some issue with your setup. What kind of preprocessing do you do? This can be the culprit.

balakrishnan1978 commented 1 year ago

Dear Michal,

I am also using TL 2023 on Windows: This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023) (preloaded format=latex 2023.5.14).

The machine runs Windows Server 2016 Version 1607 (OS Build 14393.3866), but the output is different.

I am using a Perl script for the pre-processing, and it is very simple:

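open(IN,"$ARGV[0]") || die("Input File ::: $ARGV[0] is not found\n");
open(OUT,">TeX2HTM.tex") || die("Output file cannot be created\n");
$/=chr(26);   # record separator = Ctrl-Z, so the next read returns the whole file (assuming it contains no Ctrl-Z)
$TeXin=<IN>;

# Split a Windows-style path into directory and file name.
$FilePath=$1 if ($ARGV[0]=~m/(.*)\\(.*)\.tex/si);
$lenFilePath=length($FilePath);
if ($lenFilePath > 0) {
    $FileName=$2 if ($ARGV[0]=~m/(.*)\\(.*?)\.tex/si);
    print "FilePath1 ::: $FilePath\nFileName ::: $FileName\n";
} else {
    # $cwd is not set anywhere in the posted snippet; presumably it holds the current directory.
    $cwd=~s/\//\\/g;   # forward slashes to backslashes
    $FilePath=$cwd;
    $FileName=$1 if ($ARGV[0]=~m/(.*)\.tex/si);
    print "FilePath2 ::: $FilePath\nFileName ::: $FileName\n";
}

# Remove empty font switches.
$TeXin=~s/\ \n/\n/g;
$TeXin=~s/\{\\it \}/ /g;
$TeXin=~s/\{\\bf \}/ /g;
$TeXin=~s/\{\\sf \}/ /g;
$TeXin=~s/\{\\tt \}/ /g;

$TeXin=~s/\\textit\{ \}/ /g;
$TeXin=~s/\\textbf\{ \}/ /g;
$TeXin=~s/\\textsf\{ \}/ /g;

# Convert old-style font switches to \textbf and \textit.
$TeXin=~s/\\bf\{/\\textbf{/g;
$TeXin=~s/\{\\bf (.*?)\}/\\textbf{$1}/g;
$TeXin=~s/\\it\{/\\textit{/g;
$TeXin=~s/\{\\it (.*?)\}/\\textit{$1}/g;

# Replace < and > with macros, then define the macros before \begin{document}.
$TeXin=~s/\</{\\lt}/g;
$TeXin=~s/>/{\\gt}/g;
$TeXin=~s/\ \ /\ /g;

$TeXin=~s/\\begin\{document\}/\\def\\lt{\<}\n\\def\\gt{>}\n\\begin{document}/si;

print OUT $TeXin;
close(OUT);
close(IN);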

michal-h21 commented 1 year ago

I've tried the pre-processing script, and it seems to produce almost the same result as the original TeX file. But I couldn't use it on Windows. So there must be a different issue. Unfortunately, I've run out of ideas about what could cause it. Could you try to compile this file on a different computer, just with the default version of make4ht in TeX Live 2023?

balakrishnan1978 commented 1 year ago

Dear Michal,

I'll check and get back to you further.

But Ubuntu and Windows both have the same versions on my systems (TeX Live 2023, preloaded format=latex 2023.5.14, make4ht version 0.3m). I think it might be a unicode.4hf problem, but I am not sure. I'll check further.

Thanks for your help and support.

michal-h21 commented 1 year ago

Which unicode.4hf is used on your Windows computer? You can see it with the -a debug option. It should be .../2023/texmf-dist/tex4ht/ht-fonts/mozilla/charset/unicode.4hf, I think. Is it possible that a version that translates the output to an 8-bit encoding is used? What options do you use for tex4ht? Can you post the terminal output? This is the important part:

[INFO]    mkutils: executing: tex4ht  -cmozhtf -utf8 "sample.dvi"  
----------------------------                                                                   
tex4ht.c (2018-07-03-10:36 kpathsea)                                                           
tex4ht -cmozhtf                                                                                
  -utf8                                                                                        
  sample.dvi                                                                                   
(/usr/local/texlive/2023/texmf-dist/tex4ht/base/unix/tex4ht.env)      
(/usr/local/texlive/2023/texmf-dist/tex4ht/ht-fonts/mozilla/charset/unicode.4hf)
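
If the debug output is long, the loaded charset file can be picked out mechanically. The following Python sketch assumes, based only on the excerpt above, that tex4ht prints each file it loads as a parenthesised path without spaces; the helper name `loaded_files` is made up for this illustration.

```python
import re


def loaded_files(tex4ht_output, suffix=".4hf"):
    """Return the parenthesised paths in the output that end with `suffix`."""
    # Parenthesised tokens without whitespace, e.g. "(/path/to/unicode.4hf)".
    paths = re.findall(r"\(([^()\s]+)\)", tex4ht_output)
    return [p for p in paths if p.endswith(suffix)]


sample = """\
tex4ht.c (2018-07-03-10:36 kpathsea)
tex4ht -cmozhtf
  -utf8
  sample.dvi
(/usr/local/texlive/2023/texmf-dist/tex4ht/base/unix/tex4ht.env)
(/usr/local/texlive/2023/texmf-dist/tex4ht/ht-fonts/mozilla/charset/unicode.4hf)
"""

charset_files = loaded_files(sample)
print(charset_files)
```

Paths that contain spaces would not be matched by this pattern, so treat an empty result as inconclusive rather than as proof that no .4hf file was loaded.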
balakrishnan1978 commented 1 year ago

Dear Michal,

The output does not look like what you showed above. Please find attached tex4ht-xelatex-screenshot.png and tex4ht-latex-screenshot.png for your reference.

For LaTeX I am using: texlua d:\texlive\2023\texmf-dist\scripts\make4ht\make4ht -u -c neweq.cfg -a debug CMOTeX2HTM.tex 'fn-in' '' '-p'"

tex4ht-latex-screenshot

For XeLaTeX I am using: texlua d:\texlive\2023\texmf-dist\scripts\make4ht\make4ht -ux -c neweq.cfg -a debug CMOTeX2HTM.tex 'fn-in' '' '-p'"

tex4ht-xelatex-screenshot

In my neweq.cfg I have defined \Preamble{xhtml,mathml}. Any ideas?

michal-h21 commented 1 year ago

I was finally able to reproduce the issue. It seems to be caused by the use of single quotes for passing arguments to make4ht. I am not sure why it happens, it could be some Windows stuff. Anyway, as a workaround, use double quotes, so something like:

  $ make4ht -u -c neweq.cfg -a debug CMOTeX2HTM.tex "fn-in" "" "-p"
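
The thread does not establish why single quotes fail. One plausible explanation is that the Windows command interpreter, unlike POSIX shells, does not treat single quotes as quoting characters, so they reach make4ht as part of each argument. The Python sketch below contrasts the two behaviours. `shlex.split` follows POSIX rules, while `split_windows_like` is a deliberately simplified, hypothetical model of Windows splitting that honours only double quotes and ignores backslash escaping.

```python
import shlex


def split_windows_like(cmdline):
    """Simplified model: only double quotes group text; single quotes are literal."""
    args, current, in_quotes, started = [], "", False, False
    for ch in cmdline:
        if ch == '"':
            in_quotes = not in_quotes
            started = True  # so that "" yields an empty argument
        elif ch.isspace() and not in_quotes:
            if started:
                args.append(current)
            current, started = "", False
        else:
            current += ch
            started = True
    if started:
        args.append(current)
    return args


single = "make4ht -u -c neweq.cfg -a debug CMOTeX2HTM.tex 'fn-in' '' '-p'"
double = 'make4ht -u -c neweq.cfg -a debug CMOTeX2HTM.tex "fn-in" "" "-p"'

posix_single = shlex.split(single)            # POSIX shell rules
windows_single = split_windows_like(single)   # quotes stay in the arguments
windows_double = split_windows_like(double)   # quotes are stripped

print(posix_single[-3:])
print(windows_single[-3:])
print(windows_double[-3:])
```

Under this model the last three arguments arrive with the quote characters still attached in the single-quoted Windows case, while the other two cases agree with each other, which would be consistent with the double-quote workaround.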