theseer / phpdox

Documentation generator for PHP Code using standard technology (SRC, DOCBLOCK, XML and XSLT)
http://phpdox.de

encoder error #160

Closed guckykv closed 9 years ago

guckykv commented 10 years ago

Tried latest version - installed via composer "theseer/phpdox": "*"

 [exec] [26.08.2014 - 18:37:10] Saving results to directory '/vagrant/xxx/yyy/build/phpdox/xml'
 [exec] 
 [exec] 
 [exec] Oups... phpDox encountered a problem and has terminated!
 [exec] 
 [exec] It most likely means you've found a bug, so please file a report for this
 [exec] and paste the following details and the stacktrace (if given) along:
 [exec] 
 [exec] PHP Version: 5.4.30-2+deb.sury.org~precise+1 (Linux)
 [exec] PHPDox Version: REL-3400-33-gb72f211-dirty
 [exec] Exception: TheSeer\fDOM\fDOMException (Code: 3)
 [exec] Location: /vagrant/xxx/yyy/vendor/theseer/fdomdocument/src/fDOMDocument.php (Line 234)
 [exec] 
 [exec] saving xml file failed
 [exec] 
 [exec] [XML-STRING] [Line: 0 - Column: 0] Fatal Error 6003: output conversion failed due to conv error, bytes 0xA0 0x7B 0x40 0x69
 [exec] [XML-STRING] [Line: 0 - Column: 0] Error 1544: encoder error
 [exec] 
 [exec] 
 [exec] #0 /vagrant/xxx/yyy/vendor/theseer/phpdox/src/Application.php(138): TheSeer\phpDox\Collector\Project->save()
 [exec] #1 /vagrant/xxx/yyy/vendor/theseer/phpdox/src/CLI.php(148): TheSeer\phpDox\Application->runCollector()
 [exec] #2 /vagrant/xxx/yyy/vendor/theseer/phpdox/composer/bin/phpdox(30): TheSeer\phpDox\CLI->run()

theseer commented 10 years ago

Please verify if the problem still remains with phpDox 0.7.0. In case it does, please reopen / add a comment to this issue.

guckykv commented 10 years ago

New try with current version

        "name": "theseer/phpdox",
        "version": "0.7.0",
            "url": "https://api.github.com/repos/theseer/phpdox/zipball/18ca0c645c5980d08edf44f701b2108f241b0425",

same result...

 [exec] [11.09.2014 - 16:09:58] Using config file '/vagrant/xxx/yyy/build/phpdox.xml'
 [exec] [11.09.2014 - 16:09:58] Registered collector backend 'parser'
 [exec] [11.09.2014 - 16:09:58] Registered enricher 'build'
 [exec] [11.09.2014 - 16:09:58] Registered enricher 'git'
 [exec] [11.09.2014 - 16:09:58] Registered enricher 'checkstyle'
 [exec] [11.09.2014 - 16:09:58] Registered enricher 'phpcs'
 [exec] [11.09.2014 - 16:09:58] Registered enricher 'pmd'
 [exec] [11.09.2014 - 16:09:58] Registered enricher 'phpunit'
 [exec] [11.09.2014 - 16:09:58] Registered enricher 'phploc'
 [exec] [11.09.2014 - 16:09:58] Registered output engine 'xml'
 [exec] [11.09.2014 - 16:09:58] Registered output engine 'html'
 [exec] [11.09.2014 - 16:09:58] Starting to process project 'phpdox'
 [exec] [11.09.2014 - 16:09:58] Starting collector
 [exec] [11.09.2014 - 16:09:58] Scanning directory '/vagrant/xxx/yyy/build/../src' for files to process

 [exec] [11.09.2014 - 16:10:23] Saving results to directory '/vagrant/xxx/yyy/build/phpdox/xml'
 [exec] 
 [exec] 
 [exec] Oups... phpDox encountered a problem and has terminated!
 [exec] 
 [exec] It most likely means you've found a bug, so please file a report for this
 [exec] and paste the following details and the stacktrace (if given) along:
 [exec] 
 [exec] PHP Version: 5.4.30-2+deb.sury.org~precise+1 (Linux)
 [exec] PHPDox Version: REL-3407-24-g4acf59b-dirty
 [exec] Exception: TheSeer\fDOM\fDOMException (Code: 3)
 [exec] Location: /vagrant/xxx/yyy/vendor/theseer/fdomdocument/src/fDOMDocument.php (Line 234)
 [exec] 
 [exec] saving xml file failed
 [exec] 
 [exec] [XML-STRING] [Line: 0 - Column: 0] Fatal Error 6003: output conversion failed due to conv error, bytes 0xA0 0x7B 0x40 0x69
 [exec] [XML-STRING] [Line: 0 - Column: 0] Error 1544: encoder error
 [exec] 
 [exec] 
 [exec] #0 /vagrant/xxx/yyy/vendor/theseer/phpdox/src/Application.php(138): TheSeer\phpDox\Collector\Project->save()
 [exec] #1 /vagrant/xxx/yyy/vendor/theseer/phpdox/src/CLI.php(148): TheSeer\phpDox\Application->runCollector()
 [exec] #2 /vagrant/xxx/yyy/vendor/theseer/phpdox/composer/bin/phpdox(30): TheSeer\phpDox\CLI->run()

theseer commented 10 years ago

I added a test case for this issue based on the error message and the byte sequence libxml complains about. If I run an xmllint check on an XML file with that sequence, I get the exact same error, which is correct and expected:

theseer@nyda ~/storage/php/phpdox/tests/data/issue160 master $ php xml.php | xmllint -

-:1: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xA0 0x7B 0x40 0x69
<?xml version="1.0" encoding="utf-8" ?><r>�{@i</r>

I fail to reproduce your problem with phpDox, though, as the problematic bytes should be replaced, and they are:

theseer@nyda ~/storage/php/phpdox/tests/data/issue160 master $ phpdox -f test.xml
phpDox 0.7.0-4-gf9cdd46 - Copyright (C) 2010 - 2014 by Arne Blankerts

[11.09.2014 - 16:57:02] Using config file 'test.xml'
[11.09.2014 - 16:57:02] Registered collector backend 'parser'
[11.09.2014 - 16:57:02] Registered enricher 'build'
[11.09.2014 - 16:57:02] Registered enricher 'git'
[11.09.2014 - 16:57:02] Registered enricher 'checkstyle'
[11.09.2014 - 16:57:02] Registered enricher 'phpcs'
[11.09.2014 - 16:57:02] Registered enricher 'pmd'
[11.09.2014 - 16:57:02] Registered enricher 'phpunit'
[11.09.2014 - 16:57:02] Registered enricher 'phploc'
[11.09.2014 - 16:57:02] Registered output engine 'xml'
[11.09.2014 - 16:57:02] Registered output engine 'html'
[11.09.2014 - 16:57:02] Starting to process project 'phpDox-issue160'
[11.09.2014 - 16:57:02] Starting collector
[11.09.2014 - 16:57:02] Scanning directory '/home/theseer/storage/php/phpdox/tests/data/issue160/src' for files to process

.                                                   [1]

[11.09.2014 - 16:57:02] Saving results to directory '/home/theseer/storage/php/phpdox/tests/data/issue160/xml'
[11.09.2014 - 16:57:02] Resolving inheritance

.                                                   [1]

[11.09.2014 - 16:57:02] Collector process completed

[11.09.2014 - 16:57:02] Starting generator
[11.09.2014 - 16:57:02] Loading enrichers
[11.09.2014 - 16:57:02] Starting event loop.

.................                                   [17]

[11.09.2014 - 16:57:02] Generator process completed
[11.09.2014 - 16:57:02] Processing project 'phpDox-issue160' completed.

Time: 72 ms, Memory: 5.00Mb

Can you verify whether said test case works or fails on your system? Would it be possible to get a (potentially stripped-down) version of the file causing the crash on your system?

guckykv commented 10 years ago

Here it is:

$ php /tmp/xml.php | xmllint -
-:1: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xA0 0x7B 0x40 0x69
<?xml version="1.0" encoding="utf-8" ?><r>?{@i</r>
                                          ^
$ php /tmp/xml.php 
<?xml version="1.0" encoding="utf-8" ?><r>?{@i</r>
$ xmllint --version
xmllint: using libxml version 20708
   compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib 
$ php --version
PHP 5.4.30-2+deb.sury.org~precise+1 (cli) (built: Jul  2 2014 12:07:10) 
Copyright (c) 1997-2014 The PHP Group
Zend Engine v2.4.0, Copyright (c) 1998-2014 Zend Technologies
    with Xdebug v2.2.3, Copyright (c) 2002-2013, by Derick Rethans
$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=12.04
DISTRIB_CODENAME=precise
DISTRIB_DESCRIPTION="Ubuntu 12.04.4 LTS"

theseer commented 10 years ago

Thank you for verifying and for the version details. Can you please also check how phpdox handles the test case by running:

theseer@nyda ~/storage/php/phpdox/tests/data/issue160 master $ phpdox -f test.xml

If that does not crash as it did with your codebase, I'd really like to have a (stripped down) version of the file that makes it crash.

guckykv commented 10 years ago

Too bad - it doesn't crash.

It's a big project where it fails. How can I find which file might be corrupt?

$ /vagrant/xxx/yyy/bin/phpdox -f test.xml
phpDox REL-3412-114-g8eb477a - Copyright (C) 2010 - 2014 by Arne Blankerts

[12.09.2014 - 16:22:49] Using config file 'test.xml'
[12.09.2014 - 16:22:49] Registered collector backend 'parser'
[12.09.2014 - 16:22:49] Registered enricher 'build'
[12.09.2014 - 16:22:49] Registered enricher 'git'
[12.09.2014 - 16:22:49] Registered enricher 'checkstyle'
[12.09.2014 - 16:22:49] Registered enricher 'phpcs'
[12.09.2014 - 16:22:49] Registered enricher 'pmd'
[12.09.2014 - 16:22:49] Registered enricher 'phpunit'
[12.09.2014 - 16:22:49] Registered enricher 'phploc'
[12.09.2014 - 16:22:49] Registered output engine 'xml'
[12.09.2014 - 16:22:49] Registered output engine 'html'
[12.09.2014 - 16:22:49] Starting to process project 'phpDox-issue160'
[12.09.2014 - 16:22:49] Starting collector
[12.09.2014 - 16:22:49] Scanning directory '/vagrant/phpdox/tests/data/issue160/src' for files to process

.                                                   [1]

[12.09.2014 - 16:22:49] Saving results to directory '/vagrant/phpdox/tests/data/issue160/xml'
[12.09.2014 - 16:22:49] Resolving inheritance

.                                                   [1]

[12.09.2014 - 16:22:49] Collector process completed

[12.09.2014 - 16:22:49] Starting generator
[12.09.2014 - 16:22:49] Loading enrichers
[12.09.2014 - 16:22:49] Starting event loop.

.................                                   [17]

[12.09.2014 - 16:22:50] Generator process completed
[12.09.2014 - 16:22:50] Processing project 'phpDox-issue160' completed.

Time: 480 ms, Memory: 7.75Mb

mattriverm commented 10 years ago

I was having the same problem but with another byte sequence. I was able to locate the files with

find . -type f -exec fgrep -aqs $'\x6C\x6A\x20' '{}' \; -print

It was a file with some Swedish characters in it.
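For systems where fgrep rejects raw byte patterns, the same search can be sketched in Python, which compares bytes without consulting the locale. This is only an illustration, not part of phpDox; the function name `find_files_with_bytes` is made up for this example, and the needle shown in the usage note is the sequence from the original report.

```python
import os


def find_files_with_bytes(root, needle):
    """Return the sorted paths of files under root whose raw bytes contain needle."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Read in binary mode so no decoding (and no locale) is involved.
                with open(path, "rb") as handle:
                    if needle in handle.read():
                        matches.append(path)
            except OSError:
                # Unreadable files (permissions, broken links) are skipped.
                continue
    return sorted(matches)
```

Calling `find_files_with_bytes(".", bytes([0xA0, 0x7B, 0x40, 0x69]))` would list every file below the current directory that contains the four bytes libxml complained about.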

theseer commented 10 years ago

@mattriverm Swedish characters are part of UTF-8, last I checked, and thus shouldn't be a problem. Would it be possible to get a file that crashes phpDox (or a stripped-down version)?

@guckykv Any updates? Does the aforementioned tip with find work for you?

mattriverm commented 10 years ago

Among others, it was this one

theseer commented 10 years ago

@mattriverm The file you mentioned is a JavaScript file, which gets ignored by phpDox. It's thus very unlikely to cause any issues, let alone a crash. I just cloned the repository, and at least its current state doesn't cause any issues on my system. Can you point me to a version/commit where it crashes?

guckykv commented 10 years ago

@theseer I've tried the find, but got 'fgrep: illegal byte sequence' while searching for the sequence phpdox complains about (0xA0 0x7B 0x40 0x69).

But today, I've got more info from phpdox:

 [exec] Saving XML to file '/vagrant/xxx/yyy/build/phpdox/xml/classes/xxxxxxxDataType.xml' failed

With this starting point I've found the illegal character:

/**
 * {@inheritdoc}
 */

The character in front of "{@" looked like a regular space but was actually 0xA0 (a non-breaking space). After replacing it with a real space, phpdox can parse the whole project!
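A lone 0xA0 byte is a non-breaking space in ISO-8859-1 but not valid UTF-8, where the same character is encoded as 0xC2 0xA0. As a rough sketch (not something phpDox provides), the Python helper below reports every invalid byte in a buffer with its line and byte column, so an invisible character like this one can be located directly. The name `invalid_utf8_positions` is hypothetical.

```python
def invalid_utf8_positions(data):
    """Return (line, column, byte) for each byte that breaks UTF-8 decoding.

    Lines and columns are 1-based; columns count bytes, not characters.
    """
    positions = []
    offset = 0
    while offset < len(data):
        try:
            data[offset:].decode("utf-8")
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as err:
            bad = offset + err.start  # err.start is relative to the slice
            line = data.count(b"\n", 0, bad) + 1
            column = bad - (data.rfind(b"\n", 0, bad) + 1) + 1
            positions.append((line, column, data[bad]))
            offset = bad + 1  # resume right after the offending byte
    return positions


# A docblock with a raw 0xA0 in front of "{@", as in the report above.
broken = b"/**\n *\xa0{@inheritdoc}\n */\n"
print(invalid_utf8_positions(broken))  # [(2, 3, 160)]
```

The same docblock with a properly encoded non-breaking space (`"\u00a0".encode("utf-8")`, that is 0xC2 0xA0) yields an empty list.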

But in the Jenkins environment I've got a new error; see issue #166.

Thanks!

theseer commented 10 years ago

Thank you for the feedback. I'll investigate this since 0xA0 should have gotten replaced before processing it.
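To illustrate what "replaced before processing" can mean in general, here is a minimal Python sketch that swaps invalid bytes for U+FFFD before the text reaches an XML serializer. It is not phpDox's actual code or the eventual fix; the function names `replace_invalid_utf8` and `docblock_to_xml` and the element name `r` (borrowed from the xmllint test above) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def replace_invalid_utf8(data):
    """Decode bytes as UTF-8, substituting U+FFFD for every invalid byte."""
    return data.decode("utf-8", errors="replace")


def docblock_to_xml(raw):
    """Wrap sanitized docblock text in an <r> element and serialize it as UTF-8."""
    element = ET.Element("r")
    element.text = replace_invalid_utf8(raw)
    return ET.tostring(element, encoding="utf-8")


# The byte sequence from the error message now serializes and parses cleanly.
xml_bytes = docblock_to_xml(bytes([0xA0, 0x7B, 0x40, 0x69]))
print(ET.fromstring(xml_bytes).text == "\ufffd{@i")  # True
```

Sanitizing at the point where raw source bytes become text means the serializer only ever sees valid input, at the cost of losing the original byte.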

theseer commented 10 years ago

Just to give you a heads up: I finally managed to reproduce this issue and am working on a fix.

drodil commented 10 years ago

I am facing the same problem, but with a slightly different byte sequence:

[exec] Oups... phpDox encountered a problem and has terminated!
     [exec] 
     [exec] It most likely means you've found a bug, so please file a report for this
     [exec] and paste the following details and the stacktrace (if given) along:
     [exec] 
     [exec] PHP Version: 5.6.0-1+b1 (Linux)
     [exec] PHPDox Version: 0.7.0
     [exec] Exception: TheSeer\fDOM\fDOMException (Code: 3)
     [exec] Location: phar:///usr/local/bin/phpdox/fDOMDocument-1.5.0/TheSeer/fDOMDocument/fDOMDocument.php (Line 234)
     [exec] 
     [exec] saving xml file failed
     [exec] 
     [exec] [XML-STRING] [Line: 0 - Column: 0] Fatal Error 6003: output conversion failed due to conv error, bytes 0xBC 0x6D 0x27 0x20
     [exec] [XML-STRING] [Line: 0 - Column: 0] Error 1544: encoder error
     [exec] 
     [exec] 
     [exec] #0 phar:///usr/local/bin/phpdox/phpdox/Application.php(138): TheSeer\phpDox\Collector\Project->save()
     [exec] #1 phar:///usr/local/bin/phpdox/phpdox/CLI.php(148): TheSeer\phpDox\Application->runCollector()
     [exec] #2 /usr/local/bin/phpdox(460): TheSeer\phpDox\CLI->run()

drodil commented 10 years ago

My problem was solved by changing the comment from

    /**
     * This function is used to get selection list
     * The units array example array( 'nm' => array( 'unit' => 'nm', 'name' => 'nanometre', 'multiplier' => 0.000000001 ),
      'μm' => array( 'unit' => 'μm', 'name' => 'micrometre', 'multiplier' => 0.000001 ),
      'mm' => array( 'unit' => 'mm', 'name' => 'millimetre', 'multiplier' => 0.001 ),
      'm' => array( 'unit' => 'm', 'name' => 'metre', 'multiplier' => 1 ),
      'km' => array( 'unit' => 'km', 'name' => 'kilometre', 'multiplier' => 1000 ),
      'in' => array( 'unit' => 'in', 'name' => 'inch', 'multiplier' => 0.0254 ),
      'f' => array( 'unit' => 'f', 'name' => 'foot', 'multiplier' => 0.3048 ),
      'yd' => array( 'unit' => 'yd', 'name' => 'yard', 'multiplier' => 0.9144 ),
      'mile' => array( 'unit' => 'mile', 'name' => 'mile', 'multiplier' => 1609.344 ) );
     *
     * @param array  $units         This list of value
     * @param string $selected_unit This is seleted value
     *
     * @return string Selection list HTML
     */

To

    /**
     * This function is used to get selection list
     * The units array example array( 'nm' => array( 'unit' => 'nm', 'name' => 'nanometre', 'multiplier' => 0.000000001 ),
     * 'μm' => array( 'unit' => 'μm', 'name' => 'micrometre', 'multiplier' => 0.000001 ),
     * 'mm' => array( 'unit' => 'mm', 'name' => 'millimetre', 'multiplier' => 0.001 ),
     * 'm' => array( 'unit' => 'm', 'name' => 'metre', 'multiplier' => 1 ),
     * 'km' => array( 'unit' => 'km', 'name' => 'kilometre', 'multiplier' => 1000 ),
     * 'in' => array( 'unit' => 'in', 'name' => 'inch', 'multiplier' => 0.0254 ),
     * 'f' => array( 'unit' => 'f', 'name' => 'foot', 'multiplier' => 0.3048 ),
     * 'yd' => array( 'unit' => 'yd', 'name' => 'yard', 'multiplier' => 0.9144 ),
     * 'mile' => array( 'unit' => 'mile', 'name' => 'mile', 'multiplier' => 1609.344 ) );
     *
     * @param array  $units         This list of value
     * @param string $selected_unit This is seleted value
     *
     * @return string Selection list HTML
     */
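One observation on the bytes in that error message, offered as a hint rather than a diagnosis: 0xBC 0x6D 0x27 0x20 is exactly the tail of `'μm' ` in UTF-8 once the first byte of the two-byte μ is dropped. The Python sketch below checks this, assuming the character in the comment is GREEK SMALL LETTER MU (U+03BC); the helper names are made up for the example, and it says nothing about where the lead byte was lost.

```python
def hex_bytes(data):
    """Format bytes the way the libxml error message does."""
    return " ".join(f"0x{byte:02X}" for byte in data)


def is_valid_utf8(data):
    """Return True if data decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Assumption: the comment uses U+03BC, which UTF-8 encodes as 0xCE 0xBC.
fragment = "'\u03bcm' ".encode("utf-8")
reported = bytes([0xBC, 0x6D, 0x27, 0x20])

print(hex_bytes(fragment))          # 0x27 0xCE 0xBC 0x6D 0x27 0x20
print(fragment.endswith(reported))  # True
print(is_valid_utf8(reported))      # False: 0xBC is a continuation byte with no lead byte
```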
theseer commented 9 years ago

This should be fixed for good with commits e27812001e82b88581be63e4ae4d629fbd2cb2ac and 6a5fad07cb0487f056fa221119d2df1def8744de.