Html2xhtml is a command-line tool that converts HTML files to XHTML files. The path of the HTML input file can be provided as a command-line argument. If no path is given, the input is read from stdin.
Html2xhtml always tries to generate valid XHTML files. It can correct many common errors in input HTML files without loss of information. However, for some errors, html2xhtml may decide to lose some information in order to generate valid XHTML output. This can be avoided with the -e option, which allows html2xhtml to generate non-valid output in these cases.
Html2xhtml can generate XHTML output compliant with one of the following document types: XHTML 1.0 (Transitional, Strict and Frameset), XHTML 1.1, XHTML Basic and XHTML Mobile Profile.
For full information about how to run the program see doc/html2xhtml.txt in the source code distribution, the html2xhtml.txt file in the Windows binaries ZIP file or the html2xhtml manpage. Some examples are shown below.
cat input.html | html2xhtml
html2xhtml input.html
html2xhtml input.html > output.html
html2xhtml input.html -o output.html
html2xhtml input.html -t 1.1 -o output.html
Choose an output character encoding (by default, the program uses the character encoding detected in the input):
html2xhtml input.html --ocs utf-8 -o output.html
Get the list of available character sets:
./src/html2xhtml --lcs
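The single-file invocations above can be wrapped in a loop to convert a whole directory. The following is only a minimal sketch: the function name convert_all, the HTML2XHTML variable and the .xhtml output suffix are illustrative assumptions, not part of html2xhtml. It assumes the program is installed and reachable through PATH, and it uses only the -t and -o options shown in the examples above.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with html2xhtml.
# HTML2XHTML may be set to the path of a specific binary (assumption);
# by default the command is looked up in PATH.
HTML2XHTML="${HTML2XHTML:-html2xhtml}"

# convert_all DIR: convert every *.html file in DIR to XHTML 1.1,
# writing each result next to its input with a .xhtml suffix.
convert_all() {
    dir="$1"
    for f in "$dir"/*.html; do
        [ -e "$f" ] || continue      # the glob matched nothing
        out="${f%.html}.xhtml"
        # Same options as in the single-file examples: -t and -o.
        "$HTML2XHTML" "$f" -t 1.1 -o "$out" || echo "failed: $f" >&2
    done
}
```

After sourcing the file, calling convert_all with a directory name should leave one .xhtml file beside each .html input. Failures are reported on stderr and do not stop the loop.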
Enter the main directory of the source distribution and type:
$ ./configure
$ make
You can run the test battery in order to check that the program is working as expected:
$ cd tests
$ ./test.sh
$ cd ..
To install the program on your system, type the following (it may require root privileges):
$ make install
See ./INSTALL for more information.
The program has been tested to compile on GNU/Linux and MinGW in Windows. In MinGW the actual EXE file to use is the one the compiler creates inside src\.libs instead of the one in src. It depends on the libiconv-2.dll file, which is distributed with MinGW (inside the bin\ subdirectory of the main MinGW installation directory).
The source code in the Git repository does not include the files generated by the autotools. In order to build the ./configure script, run the following commands from the main directory of the sources:
$ aclocal
$ libtoolize
$ touch config.rpath
$ autoheader
$ automake --add-missing
$ autoconf
On OS X, use the glibtoolize command instead of libtoolize.
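The choice between the two command names can be made automatically. This is a minimal sketch, assuming that `uname -s` prints Darwin on OS X; the function name pick_libtoolize is hypothetical and not part of the source distribution.

```shell
#!/bin/sh
# Hypothetical helper: print the libtoolize command name to use.
# Takes an optional kernel name; defaults to the output of `uname -s`.
pick_libtoolize() {
    case "${1:-$(uname -s)}" in
        Darwin) echo glibtoolize ;;   # OS X, as noted above
        *)      echo libtoolize ;;    # GNU/Linux and other systems
    esac
}
```

Running "$(pick_libtoolize)" in place of libtoolize in the command sequence above would then cover both platforms.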
After that, you should get the ./configure script and proceed as explained above:
$ ./configure
$ make