pear2 / pear2.php.net

PEAR2 website
http://pear2.php.net/
Other
19 stars 9 forks source link

info.xml interpreted as text, not as XML. Needs more dynamic parser. #26

Closed addiks closed 11 years ago

addiks commented 11 years ago

When the REST component downloads a file, it checks the content-type and tries to parse it. For the content-type detection it simply checks if the provided content-type is exactly 'text/xml' or 'application/xml' for XML and anything else for text-based/unparsed. Problem is: this only works when the server sends exactly one of these two strings as content-type and nothing else.

When it tries to download 'http://packages.zendframework.com/rest/p/zend_http/info.xml' the given content-type sent by the server is (currently) 'text/html; charset=iso-8859-1,text/xml', which does not match one of the strings above and gets wrongly interpreted as text and not parsed to an XML object. To correctly handle such complex content-types, a more dynamic parser is needed.

The exact position of the problematic code is (currently) in '/Pyrus/REST.php' on lines 117-133 (in method 'retrieveData').

saltybeagle commented 11 years ago

This sounds like the server (packages.zendframework.com) isn't sending the correct content type.

addiks commented 11 years ago

Well, there are two possible solutions to this problem.

1) The first is that pyrus requires every server to send exactly the needed content-type being 'text/xml' and nothing else.

2) The seconds is that pyrus allows dynamic content-types.

I think that the second solution is the better one for multiple reasons: A) It's the standard. HTTP: http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7 B) This requirement is something few people will expect and therefore will cause lots of headache. C) Every proxy, firewall, gateway, router, ... might fiddle in the headers and cause additional problems. D) A single fix in the code is much easier and cleaner then every channel-maintainer fixing its channel. E) You will get a lot of channel-maintainers flooding the forums, irc and bug-trackers asking why their channel's doesn't work. (you need to debug the code in order to identify this as the cause that the channel is broken)

In contrast to that, the only benefit of the first solution i see here is that a developer might safe a few minutes/hours.

There might be some other reasons that i couldn't think of yet, but i think my point is clear that this really should be done.

[I could even do this myself if nobody has time for this, but first i want to make sure that the patch would be accepted and i might need some time fiddling myself into git (I'm used to svn but want to try git anyway) and how to develop for the pyrus core.]

addiks commented 11 years ago

There might be a third possible solution:

There could be an optional parameter to the 'retrieveData' method that allows the caller to enforce a specific parser to be used regardless of what the server sends. This would be easier to implement then a dynamic parser (although one would be a good thing anyway) and still solves the error.

The particular code that is causing the errors (In /Pyrus/Channel/RemotePackage.php) fails because it expects an XML object (line 450: "$info = $this->getPackageInfo($var);") and instead gets a string and triggers errors like 'Invalid string offset' (lines 486-490, all like "$pxml->channel = $info['c'];").

saltybeagle commented 11 years ago

Would you mind moving this issue to the Pyrus repository? This doesn't have anything to do with this repository (the pear2 website).