googleads / googleads-java-lib

Google Ad Manager SOAP API Client Library for Java
Apache License 2.0

Invalid XML character error #221

Closed borisenko-i closed 3 years ago

borisenko-i commented 3 years ago

Hi team!

We are observing the following error while performing a request to AdGroupAdService via the SDK:

An invalid XML character (Unicode: 0xb) was found in the element content of the document.

It looks like the service is returning the character 0xb, which causes the XML deserializer to fail. Here's the SDK-related part of the stack trace. Is there a way to handle this on our side without making changes to the data?

Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 4743; An invalid XML character (Unicode: 0xb) was found in the element content of the document.
    at org.apache.axis.AxisFault.makeFault(AxisFault.java:101)
    at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:701)
    at org.apache.axis.Message.getSOAPEnvelope(Message.java:435)
    at org.apache.axis.transport.http.HTTPSender.readFromSocket(HTTPSender.java:796)
    at org.apache.axis.transport.http.HTTPSender.invoke(HTTPSender.java:144)
    at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
    at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
    at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
    at org.apache.axis.client.AxisClient.invoke(AxisClient.java:165)
    at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
    at org.apache.axis.client.Call.invoke(Call.java:2767)
    at org.apache.axis.client.Call.invoke(Call.java:2443)
    at org.apache.axis.client.Call.invoke(Call.java:2366)
    at org.apache.axis.client.Call.invoke(Call.java:1812)
    at com.google.api.ads.adwords.axis.v201809.cm.AdGroupAdServiceSoapBindingStub.get(AdGroupAdServiceSoapBindingStub.java:1793)
    at sun.reflect.GeneratedMethodAccessor651.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.google.api.ads.common.lib.soap.SoapClientHandler.invoke(SoapClientHandler.java:100)
    at com.google.api.ads.common.lib.soap.axis.AxisHandler.invokeSoapCall(AxisHandler.java:234)
    at com.google.api.ads.common.lib.soap.SoapServiceClient.callSoapClient(SoapServiceClient.java:63)
    at com.google.api.ads.common.lib.soap.SoapServiceClient.invoke(SoapServiceClient.java:93)
    at com.sun.proxy.$Proxy70.get(Unknown Source)
    ... 15 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 4743; An invalid XML character (Unicode: 0xb) was found in the element content of the document.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1472)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2923)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:842)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:771)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:327)
    at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
    at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:696)
    ... 39 more
nwbirnie commented 3 years ago

It looks like an invalid response payload is being received. I'm not 100% sure why from this stack trace though.

Do you possibly have a proxy that requires authentication? It might be that the API requests aren't being proxied correctly.

Does this happen on all services or just this one? Is it all requests to this service that fail or only some? If there's a specific request that reliably reproduces the issue could you share the request payload please?

Could you try enabling logging for the client library and share the logs if available please? I'd particularly like the request ID if it's available so I could check logs on our side. I'm guessing this might not work given the nature of the issue, but worth a try.

jradcliff commented 3 years ago

I've seen this before in rare cases where an AdGroupAd was created or modified in the UI or other interfaces, and that interface let an invalid character slip in.

If logging fails to give you the request ID, could you send the Selector you are using in the get call, as well as an ad group ID or ad ID expected in the result? That will allow us to investigate on our side.

Thanks

borisenko-i commented 3 years ago

@nwbirnie @jradcliff thanks for the quick responses!

> Does this happen on all services or just this one?

We've encountered the same errors before with other services.

> I'd particularly like the request ID if it's available

We're investigating this issue on behalf of our customer, and unfortunately don't have their consent to share anything related to their data yet. I'll request an approval to do so, but I can't guarantee that we'll have it.

In the meantime, do you have any ideas on how we can resolve this without looking into the customer's data? The source of this particular issue seems pretty clear: an XML-incompatible character (0xb) is present in the data returned from AdGroupAdService, which makes the parser fail. And yes, this could well have been caused by changing the data via the UI.
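For anyone who wants to confirm the parser behaviour without touching customer data, here is a minimal sketch that feeds a tiny document containing U+000B (vertical tab) to the JDK's built-in SAX parser. The class name `InvalidXmlCharDemo`, the helper `parseError`, and the sample element are made up for illustration; this does not go through Axis or the client library, it only shows that a conforming XML 1.0 parser rejects this character in element content.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of googleads-java-lib.
public class InvalidXmlCharDemo {

  /** Returns the parser's error message, or null if the document parsed cleanly. */
  public static String parseError(String xml) {
    try {
      SAXParserFactory.newInstance()
          .newSAXParser()
          .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
      return null;
    } catch (SAXParseException e) {
      // Raised for fatal well-formedness errors such as a disallowed character.
      return e.getMessage();
    } catch (Exception e) {
      // Parser configuration or I/O problems are unexpected in this sketch.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    char verticalTab = (char) 0x0B; // the character reported as "Unicode: 0xb"

    // Well-formed content: prints "null".
    System.out.println(parseError("<description1>ok</description1>"));

    // Same element with U+000B embedded: prints the parser's error message.
    System.out.println(
        parseError("<description1>bad" + verticalTab + "text</description1>"));
  }
}
```

The second call fails before any SOAP-level deserialization would happen, which matches the stack trace above: the exception originates in the XML scanner, not in the generated stub.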

nwbirnie commented 3 years ago

AdGroupAdService may provide the raw ad creative (which may have subsequently been disapproved for policy reasons).

Could you try to reproduce the issue with the Google Ads API? If you can't reproduce it (with the same request + account ID), then you can migrate this call over to the Google Ads API, which supports deserializing these characters.

borisenko-i commented 3 years ago

> you can migrate this call over to the Google Ads API

@nwbirnie we are currently migrating our code to the Google Ads API, but this work is still in progress and it will take some time to completely roll it out. I did a test run on the Google Ads API, and it seems to work fine, but we'd like to find a quick solution with the AdWords API if possible.

> Could you try enabling logging for the client library and share the logs if available please? I'd particularly like the request ID if it's available so I could check logs on our side.

Here's the log output related to this issue. Unfortunately, the request ID is null.

nwbirnie commented 3 years ago

This is being caused by the description1 field in:

Could you possibly remove the \u000B character from the description? Alternatively, could you filter this ad out of your query?
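If it helps whoever owns the data, the following is a rough sketch of how ad text could be checked or cleaned before it is written back. It keeps only code points allowed by the XML 1.0 `Char` production (tab, line feed, carriage return, and the ranges 0x20-0xD7FF, 0xE000-0xFFFD, 0x10000-0x10FFFF). The class `XmlCharFilter` and its method names are hypothetical and not part of the client library, and this does not change how the Axis client parses responses; it only applies to strings you already hold.

```java
// Hypothetical helper, not part of googleads-java-lib.
public class XmlCharFilter {

  /** True if the code point is allowed by the XML 1.0 Char production. */
  public static boolean isValidXml10Char(int cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
  }

  /** Returns the text with every code point that XML 1.0 disallows removed. */
  public static String stripInvalidXmlChars(String text) {
    StringBuilder out = new StringBuilder(text.length());
    text.codePoints()
        .filter(XmlCharFilter::isValidXml10Char)
        .forEach(out::appendCodePoint);
    return out.toString();
  }

  public static void main(String[] args) {
    // Build the sample with an explicit cast so the vertical tab is visible.
    String description = "Great deals" + (char) 0x0B + " today";
    System.out.println(stripInvalidXmlChars(description)); // prints "Great deals today"
  }
}
```

Dropping the character silently is a design choice; replacing it with a space, or rejecting the text and reporting the offending position, may suit some workflows better.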

nwbirnie commented 3 years ago

Code to reproduce

borisenko-i commented 3 years ago

Thank you @nwbirnie, I'll communicate this to our customer who owns the data.

nwbirnie commented 3 years ago

No problem, thanks for reporting. I'll close this for now, but please feel free to loop back if there are further issues.