phpro / soap-client

A general purpose SOAP client for PHP

SOAP response contains extra characters (headers) and can't be decoded #493

Closed mtx-z closed 9 months ago

mtx-z commented 9 months ago

Bug Report

Version: 3.1.0 (with Laravel 10 and PHP 8.2)

Summary

Hello, I'm consuming a remote SOAP (SOAP_1_1) endpoint serving some e-commerce data. I cannot share the WSDL URL, and I cannot edit the SOAP server configuration. The server correctly sends back the response, but some extra characters around the XML envelope seem to prevent decoding.

Current behavior

I'm sure that the response contains the result object I need. To debug, I edited vendor/php-soap/ext-soap-engine/src/ExtSoapDecoder.php to dump $response->getPayload(). It gives me:

"

--uuid:4be97702-6215-4af9-9cee-fd25495b1d8d+id=871

Content-ID: <http://tempuri.org/0>

Content-Transfer-Encoding: 8bit

Content-Type: application/xop+xml;charset=utf-8;type=\"text/xml\"

<s:Envelope xmlns:s=\"http://schemas.xmlsoap.org/soap/envelope/\"><s:Body><GET_LIEUResponse xmlns=\"http://tempuri.org/\"><GET_LIEUResult><xs:schema id=\"NewDataSet\" xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" xmlns=\"\" xmlns:msdata=\"urn:schemas-microsoft-com:xml-msdata\"><xs:element name=\"NewDataSet\" msdata:IsDataSet=\"true\" msdata:UseCurrentLocale=\"true\"><xs:complexType><xs:choice minOccurs=\"0\" maxOccurs=\"unbounded\"><xs:element name=\"Lieux\"><xs:complexType><xs:sequence><xs:element name=\"catalogues_id\" msdata:DataType=\"System.Guid, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_id\" msdata:DataType=\"System.Guid, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_nom\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_descriptif\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_descriptif_data\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_logo\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_plan\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_ouvert\" type=\"xs:boolean\" minOccurs=\"0\"/><xs:element name=\"lieux_date_ouverture\" type=\"xs:dateTime\" minOccurs=\"0\"/><xs:element name=\"lieux_infos_ouverture\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_url_informations\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_url_reservation\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_latitude\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_longitude\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_format_keycard\" type=\"xs:string\" minOccurs=\"0\"/><xs:element name=\"lieux_actif\" type=\"xs:boolean\" minOccurs=\"0\"/></xs:sequence></xs:complexType></xs:element></xs:choice></xs:complexType></xs:element></xs:schema><diffgr:diffgram 
xmlns:diffgr=\"urn:schemas-microsoft-com:xml-diffgram-v1\" xmlns:msdata=\"urn:schemas-microsoft-com:xml-msdata\"><NewDataSet xmlns=\"\"><Lieux diffgr:id=\"Lieux1\" msdata:rowOrder=\"0\"><catalogues_id>c874ef73-5f16-44f3-9c5c-d6464812972f</catalogues_id><lieux_id>fdc765d0-51ab-4075-9a3d-fc01a922eaa0</lieux_id><lieux_nom>CINEPASS PATHE GAUMONT</lieux_nom><lieux_descriptif>https://xxxx.fr/lieu_descriptif.aspx?id=fdc765d0-51ab-4075-9a3d-fc01a922eaa0</lieux_descriptif><lieux_descriptif_data>&lt;p style=\"text-align: center;\"&gt;&amp;nbsp;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&lt;strong&gt;&lt;span style=\"font-size: 22px; color: #ff0000;\"&gt;Path&amp;eacute; Gaumont France&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&amp;nbsp;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&lt;img alt=\"\" src=\"https://xxxx.com/images/cinema/test/19_600x396.jpg\" /&gt;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&amp;nbsp;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&lt;span style=\"font-family: Arial; font-size: 13px; color: #000000;\"&gt;Pr&amp;eacute;sents dans plus de 45 villes avec 70 cin&amp;eacute;mas et&amp;nbsp; plus de 700 &amp;eacute;crans, &lt;/span&gt;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&lt;span style=\"font-family: Arial; font-size: 13px; color: #000000;\"&gt;les cin&amp;eacute;mas Path&amp;eacute; Gaumont sont leaders de l&amp;rsquo;exploitation cin&amp;eacute;matographique en France.&lt;/span&gt;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&lt;span style=\"font-family: Arial; font-size: 13px; color: #000000;\"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&lt;span style=\"font-family: Arial; font-size: 13px; color: #000000;\"&gt;La strat&amp;eacute;gie de mont&amp;eacute;e en gamme et de modernisation des Cin&amp;eacute;mas Path&amp;eacute; Gaumont repose sur une politique active de cr&amp;eacute;ation, &lt;/span&gt;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&lt;span style=\"font-family: Arial; font-size: 13px; color: #000000;\"&gt;de reconstruction et de r&amp;eacute;novation, une innovation permanente avec les meilleures technologies &lt;/span&gt;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&lt;span style=\"font-family: Arial; font-size: 13px; color: #000000;\"&gt;(Imax, 4DX, ScreenX, Dolby Cin&amp;eacute;ma),&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&lt;span style=\"font-family: Arial; font-size: 13px; color: #000000;\"&gt;&amp;nbsp;&lt;/span&gt;&lt;span style=\"font-size: 12px;\"&gt;des services in&amp;eacute;dits (num&amp;eacute;rotation des places) et adapt&amp;eacute;s et un parcours spectateur optimis&amp;eacute;, en salles et sur le digital&lt;/span&gt;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: center;\"&gt;&amp;nbsp;&lt;/p&gt;&#xD;
&lt;p style=\"text-align: left;\"&gt;&amp;copy;path&amp;eacute;gaumont&lt;/p&gt;</lieux_descriptif_data><lieux_logo>https://xxxx/image/lieu_xxxx-51ab-4075-9a3d-fc01a922eaa0_0_0_0_0_20231212014149.png</lieux_logo><lieux_plan/><lieux_ouvert>true</lieux_ouvert><lieux_infos_ouverture/><lieux_url_informations>https://www.cinemaspathegaumont.com/cinepass</lieux_url_informations><lieux_url_reservation/><lieux_latitude/><lieux_longitude/><lieux_actif>true</lieux_actif></Lieux></NewDataSet></diffgr:diffgram></GET_LIEUResult></GET_LIEUResponse></s:Body></s:Envelope>

--uuid:XXXXXXX-6215-4af9-9cee-fd25495b1d8d+id=871--

"


So the object I need is in the HTTP response I receive. Authentication, method mapping, etc. are OK. The issue occurs when decoding the response.

Error

Phpro\SoapClient\Exception\SoapException {#6634 ▼ // app\Services\xx\ProductSync\JsonSyncLoop.php:170
  #message: "looks like we got no XML document"
  #code: 0
  #file: "C:\laragon\www\xxxxx\vendor\phpro\soap-client\src\Phpro\SoapClient\Exception\SoapException.php"
  #line: 19
  -previous: SoapFault {#6638 ▶}
  trace: {▶}

What I tried

I tried to manipulate the response using a custom middleware or response body manipulation, without success.

I don't know what I should do. Should I try again to edit the XML response to remove the special character? Or is there any "do not throw" parameter so I can get the complete response and parse it?

Note that I was using wsdl2phpgenerator before, and I could get a response from the same SOAP server. I just tried again with the wsdl2phpgenerator classes, and I'm able to get the response object without an error being thrown:

if (isset($response->GET_LIEUResult)) {
    return (object) (array) simplexml_load_string($response->GET_LIEUResult->any)->NewDataSet->Lieux;
}

Thank you for your help; and for the awesome package.

veewee commented 9 months ago

Hello @mtx-z,

At the moment, there is indeed no support for requests / responses in XOP format like e.g. MTOM services. It's something I surely want to support at some point in time.

There is an open discussion about this, which dates back to 2017 or so, to give you an idea: https://github.com/phpro/soap-client/discussions/357

About your comments:

I tried to manipulate the response using a custom middleware or response body manipulation, without success. I don't know what I should do. Should I try again to edit the XML response to remove the special character? Or is there any "do not throw" parameter so I can get the complete response and parse it?

If done the right way, this should be a valid (and suggested) workaround. In your Guzzle middleware, you are not removing the multipart boundaries. It's not a matter of encoding URL characters or doing UTF-8 conversions, but a matter of removing those --uuid:xxxxx lines and the part headers that follow them, keeping only the XML.
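As a rough illustration of that workaround, the sketch below pulls the SOAP XML out of a multipart body by splitting on the boundary line rather than matching the envelope tag. extractRootPart() is a hypothetical helper, not part of phpro/soap-client. It assumes the SOAP envelope is the first MIME part, that the part headers are followed by a blank line, and that the boundary string does not occur inside the XML itself.

```php
<?php

declare(strict_types=1);

/**
 * Returns the body of the first MIME part of a multipart payload,
 * or the payload unchanged when it does not look like multipart.
 * Hypothetical helper for illustration only.
 */
function extractRootPart(string $payload): string
{
    $trimmed = ltrim($payload);

    // A multipart body starts with a "--<boundary>" line.
    if (!str_starts_with($trimmed, '--')) {
        return $payload;
    }

    // The first line, e.g. "--uuid:...+id=871", is the delimiter.
    $delimiter = substr($trimmed, 0, strcspn($trimmed, "\r\n"));

    // Index 0 is the empty preamble, index 1 is the first part.
    $parts = explode($delimiter, $trimmed);
    $firstPart = ltrim($parts[1] ?? '', "\r\n");

    // Part headers end at the first blank line; the rest is the part body.
    $split = preg_split('/\R\R/', $firstPart, 2);

    return trim($split[1] ?? $split[0]);
}

// Usage with a shortened payload shaped like the one in this issue.
$raw = "--uuid:abc+id=1\r\n"
    . "Content-ID: <http://tempuri.org/0>\r\n"
    . "Content-Type: application/xop+xml;charset=utf-8;type=\"text/xml\"\r\n"
    . "\r\n"
    . "<s:Envelope xmlns:s=\"http://schemas.xmlsoap.org/soap/envelope/\"><s:Body/></s:Envelope>\r\n"
    . "--uuid:abc+id=1--\r\n";

echo extractRootPart($raw), PHP_EOL;
```

Because the extracted string is returned byte for byte, no URL decoding or charset conversion is applied to the XML.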

Of course, this will not solve the underlying problem of us not being able to support XOP payloads at the moment. I'm thinking more along the lines of solutions that can keep track of the "attachments" while only keeping the SOAP XML as the actual response. Maybe even with inline replacement of the xop:Include elements with the binary data directly.
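To make the inline-replacement idea a little more concrete, here is a minimal sketch using PHP's DOM extension. It is not how the package works (there is no XOP support yet). inlineXopIncludes() and the $attachments map are hypothetical, and the sketch assumes the attachments have already been parsed out of the multipart body and keyed by their Content-ID without angle brackets.

```php
<?php

declare(strict_types=1);

// Namespace of the xop:Include element as defined by the W3C XOP recommendation.
const XOP_NS = 'http://www.w3.org/2004/08/xop/include';

/**
 * Replaces every xop:Include element with the base64 text of its attachment.
 *
 * @param array<string, string> $attachments raw bytes keyed by Content-ID (assumed input)
 */
function inlineXopIncludes(string $soapXml, array $attachments): string
{
    $document = new DOMDocument();
    $document->loadXML($soapXml);

    $xpath = new DOMXPath($document);
    $xpath->registerNamespace('xop', XOP_NS);

    // Copy the matches into an array before modifying the tree.
    $includes = iterator_to_array($xpath->query('//xop:Include'));

    foreach ($includes as $include) {
        // The href has the form "cid:<URL-encoded Content-ID>".
        $href = $include->getAttribute('href');
        $contentId = rawurldecode((string) preg_replace('/^cid:/', '', $href));

        if (!array_key_exists($contentId, $attachments)) {
            continue; // Leave unresolved references untouched.
        }

        $include->parentNode->replaceChild(
            $document->createTextNode(base64_encode($attachments[$contentId])),
            $include
        );
    }

    return $document->saveXML($document->documentElement);
}

// Usage with a made-up element and attachment.
$xml = '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Body><file>'
    . '<xop:Include xmlns:xop="http://www.w3.org/2004/08/xop/include" href="cid:http%3A%2F%2Ftempuri.org%2F1"/>'
    . '</file></s:Body></s:Envelope>';

echo inlineXopIncludes($xml, ['http://tempuri.org/1' => 'hello']), PHP_EOL;
```

Inlining as base64 matches the xs:base64Binary content that an xop:Include stands in for, so a regular SOAP decoder could then read the element as if the message had never been optimized.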

Note that I was using wsdl2phpgenerator before, and I could get a response from the same SOAP server. I just tried again with the wsdl2phpgenerator classes, and I'm able to get the response object without an error being thrown.

I don't know how wsdl2phpgenerator deals with this, but it's interesting to hear that it supports this. If I find some more time, I can take a look at the codebase to see if I can figure out how they are doing it.

Going forward

As mentioned before: adding support for this kind of thing would be very nice indeed! Currently I have neither access to nor experience with MTOM / XOP services. It could help a lot if we could do an integration to gain some knowledge in this field.

Therefore I kindly link you to: https://github.com/php-soap/.github/blob/main/HELPING_OUT.md#let-us-do-your-implementation

If you want us to support this out of the box in our packages, we can always help you out with your implementation and transform those learnings to something that can generally be used by anyone who is facing the same issues as you are. If you just want to get around this issue in your project, that is fine as well of course :)

mtx-z commented 9 months ago

Hello @veewee,

thank you a lot for all those explanations and details. Unfortunately, I'm not sure that I have the required experience to create a PR for your package to support this.

But I could try to set up a middleware to strip the non-XML lines from the response. I tested a simple regex that should do the work: $pattern = "/<s:Envelope(.*)<\/s:Envelope>/s";.

I was able to edit the response to only keep the XML data:

use Http\Client\Common\Plugin;
use Http\Promise\Promise;
use Psr\Http\Message\RequestInterface;
use Psr\Http\Message\ResponseInterface;
use Soap\Psr18Transport\Xml\XmlMessageManipulator;
use VeeWee\Xml\Dom\Document;

class SoapMiddleware implements Plugin
{
    public function handleRequest(RequestInterface $request, callable $next, callable $first): Promise
    {
        return $next($request)
            ->then(function (ResponseInterface $response): ResponseInterface {
                // Extract everything from <s:Envelope to </s:Envelope>, tags included.
                $pattern = "/<s:Envelope(.*)<\/s:Envelope>/s";
                $body = $response->getBody()->getContents();
                if (preg_match($pattern, $body, $matches) !== 1) {
                    // No envelope found: leave the response untouched.
                    return $response;
                }
                $newBody = utf8_encode(urldecode($matches[0]));
                // Wrap the string in a new stream resource.
                $streamBody = fopen('data://text/plain,' . $newBody, 'r');

                $response = $response
                    ->withBody(new \GuzzleHttp\Psr7\Stream($streamBody));

                return (new XmlMessageManipulator)(
                    $response,
                    fn (Document $document) => $document->manipulate(
                        function ($document) {
                            //dd('manipulate2', $document);
                        }
                    )
                );
            });
    }
}

But I'm getting a new error from ExtSoapDecoder: "Cannot assign string to property App\Services\Sld\PhpRoSoap\Type\GETLIEUResult::$schema of type App\Services\Sld\PhpRoSoap\Type\Schema". At least it seems the data is now readable up to a certain point. Any idea?

Thanks!

veewee commented 9 months ago

Hello,

Unfortunately, I'm not sure that I have the required experience to create a PR for your package to support this. But I could try to set up a middleware to strip the non-XML lines from the response. I tested a simple regex that should do the work: $pattern = "/<s:Envelope(.*)<\/s:Envelope>/s";.

That could work for you, but it won't be sufficient to ship: MTOM is about adding multipart binary files to the request / response in combination with a SOAP message. So we'll need some more advanced things to work with. But of course, it could work for your case.

But I'm getting a new error from ExtSoapDecoder: "Cannot assign string to property App\Services\Sld\PhpRoSoap\Type\GETLIEUResult::$schema of type App\Services\Sld\PhpRoSoap\Type\Schema". At least it seems the data is now readable up to a certain point. Any idea?

It's hard to tell from the given information about your service. But the following line might break or alter some namespaces, leaving the decoder unable to map the response into the correct structure:

$newBody = utf8_encode(urldecode($matches[0]));
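A small sketch of why that line is risky when the body is already UTF-8 XML. The fragment is made up for illustration, and whether this is what causes the reported error is not established here. mb_convert_encoding() from ISO-8859-1 stands in for utf8_encode(), which performs the same conversion but is deprecated as of PHP 8.2; the sketch assumes the mbstring extension is available.

```php
<?php

declare(strict_types=1);

// Hypothetical fragment: valid UTF-8 containing "+", "%25" and an accented letter.
$xml = '<note code="a+b" rate="50%25">Path' . "\u{E9}" . '</note>';

// urldecode() turns "+" into a space and "%25" into "%".
$decoded = urldecode($xml);

// Treating UTF-8 bytes as ISO-8859-1 re-encodes each byte of the accented
// letter separately, so the two-byte character becomes four bytes.
$reencoded = mb_convert_encoding($xml, 'UTF-8', 'ISO-8859-1');

var_dump($decoded === $xml);                 // bool(false)
var_dump(strlen($reencoded) - strlen($xml)); // int(2)
```

Passing the extracted envelope through unchanged avoids both alterations.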

veewee commented 9 months ago

Closing this one: feedback was provided.