gwtproject / gwt

GWT Open Source Project
http://www.gwtproject.org
1.51k stars 373 forks source link

StackTraceDeobfuscator sometimes produces total nonsense #9931

Open jnehlmeier opened 7 months ago

jnehlmeier commented 7 months ago

GWT version: head


Description

Once the application is large enough I think a hidden bug in GWTs symbol maps / source maps appears more and more often. It results in deobfuscated stack traces with incorrect file and line numbers.

Consider the following example:

at java.lang.Throwable.Throwable(com/google/common/base/MoreObjects.java:437)
    at java.lang.RuntimeException.RuntimeException(com/google/common/base/Preconditions.java:1422)
    at com.google.web.bindery.event.shared.UmbrellaException.UmbrellaException(UmbrellaException.java:142)
    at com.google.gwt.event.shared.UmbrellaException.UmbrellaException(com/google/common/base/Splitter.java:424)
    at com.google.gwt.user.client.ui.AttachDetachException.AttachDetachException(com/google/common/collect/RegularImmutableAsList.java:44)
    at com.google.gwt.user.client.ui.AttachDetachException.tryCommand(com/google/common/collect/RegularImmutableAsList.java:34)
    at com.google.gwt.user.client.ui.Panel.doDetachChildren(Panel.java:108)
    at com.google.gwt.user.client.ui.Widget.$onDetach(com/google/common/base/Joiner.java:276)
    at com.google.gwt.user.client.ui.Widget.onDetach(Widget.java:108)
    at com.google.gwt.user.client.ui.Composite.$onDetach(com/google/common/collect/Iterators.java:1433)
    at com.google.gwt.user.client.ui.Composite.onDetach(Composite.java:225)
    at com.google.gwt.user.client.ui.AttachDetachException$2.execute(AttachDetachException.java:249)
    at com.google.gwt.user.client.ui.AttachDetachException.tryCommand(com/google/common/collect/RegularImmutableAsList.java:48)

For some reason it often contains Guava classes as files for GWT classes but it can also contain other wrong files from the application itself or GWT SDK.

Steps to reproduce

Difficult to reproduce in an example project

Known workarounds

None

niloc132 commented 7 months ago

We're actually looking at updating the sourcemap implementation right now for other benefits (slightly faster linking, fewer tiny files on disk, and drop the mostly-unneeded old protobuf dependency in gwt-servlet), and can take a look at this at the same time if there is some way to reproduce this, or at least broken symbolMap/sourcemap files to work from? Those files plus the frame info (obf method name, file name, line and column numbers) should make it possible to see where the mismatch is, and hopefully reproduce and fix it.

I've never dug too deeply on this part, but it looks like the symbolmap is first used to rewrite from the obfuscated/generated js method name, then augment with sourcemaps. If I'm reading this right, the fact that you see qualified paths (com/google/common/base/Joiner.java etc) rather than just the bare file name (Widget.java or Joiner.java) suggests that you're not just getting the symbolmap data, as that reads after the last index of / in the uri: https://github.com/gwtproject/gwt/blob/d77e85c2a80d086e387d2413932499831c1718ea/user/src/com/google/gwt/core/server/StackTraceDeobfuscator.java#L257-L261

Instead, the bad path seems to be coming from sourcemaps. Any chance you're customizing the linker in any way, which might alter the offsets...?

jnehlmeier commented 7 months ago

Sadly I cannot share the code or symbolMap/sourcemap. It is also pretty large since a permutation is about 10MB. The above example was produced on iOS. I could try checking if it is always iOS/webkit or if other browsers also have the problem and if it is always using a qualified path if the filename is wrong.

At least I can say that no custom linker is used, just the default one of GWT.

First I thought it could be JsInliner but then some really wild combinations occurred, like the one above, that cannot be explained with inlining. It looks like a more fundamental problem either in GWT or in the app.

niloc132 commented 7 months ago

We don't need the full thing to analyze, see what might be wrong, just enough to accurately point the finger? Otherwise we're pretty limited to finding someone else who can share some files.

In theory, just the stack trace (with all the details, like the one you shared above except pre-deobfuscation), the relevant symbolmap lines (again, only maps to com.google.common and com.google.gwt classes and methods)... and the sourcemap info. That is a bit harder to decompose, but at least if you confirm the first two, you will know that symbolmaps are probably not to blame, and that we're looking at a sourcemap issue of some kind.

They certainly can be decomposed though - the java api in sourcemap-rebased.jar isn't too terrible to walk if you want to do it yourself. Alternatively, as I said in my other comment, we'll be updating that jar soon, and maybe the issue will resolve itself from some change to closure-compiler in the last ~8 years...

niloc132 commented 6 months ago

Any actual details you can share on this? Otherwise, consider pulling from https://github.com/gwtproject/gwt/pull/9936 (and leaving compiler.embedSourceMaps to false, since you aren't interested in that).

Also just in case, you don't think this is a dup of https://github.com/gwtproject/gwt/issues/9038? I'm assuming that you checked, but maybe you missed that one? Just in case, that would seem to add a constant offset somewhere in your stack, which could produce what you're seeing (but you could only confirm with access to the compiled JS). At a brief glance, that should only happen if you happen to be generating prod code as non-OBFUSCATED.

Given that only frames that are correct are those with no sourcemap details added, it seems plausible that this is the same issue? Can you confirm other compiler settings, etc, so we can hunt this down while we're working on sourcemaps?

jnehlmeier commented 6 months ago

Not really. Poking around takes too long at work since compiling the app is about 10-15 minutes and I actually forgot to check the issue tracker if only specific browsers are affected. I am basically waiting for an exception that can easily be reproduced and has a broken stack trace. Then I can checkout the source of that production build, modify the server and log both JS and deobfuscated stack traces using various browsers.

I can't remember how it works but I think I have never seen a broken stack trace during development using Chrome + SDM. In development exceptions are directly logged to browser console.

jnehlmeier commented 1 month ago

Hmm after having almost latest GWT in production with updated sourcemaps we received

java.lang.NullPointerException: Method name is null
    at java.base/java.util.Objects.requireNonNull(Objects.java:259)
    at java.base/java.lang.StackTraceElement.<init>(StackTraceElement.java:166)
    at java.base/java.lang.StackTraceElement.<init>(StackTraceElement.java:124)
    at com.google.gwt.core.server.StackTraceDeobfuscator.resymbolize(StackTraceDeobfuscator.java:317)
    at com.google.gwt.core.server.StackTraceDeobfuscator.resymbolize(StackTraceDeobfuscator.java:197)

which we never received before.

The code in question is

    // try to refine location via sourcemap
    if (sourceMapCapable && fragmentId != -1 && column != -1) {
      SourceMapping sourceMapping = loadSourceMap(strongName, fragmentId);
      if (sourceMapping != null && ste.getLineNumber() > -1) {
        OriginalMapping mappingForLine = sourceMapping
            .getMappingForLine(jsLineNumber, column);
        if (mappingForLine != null) {

          if (declaringClass == null || declaringClass.equals(ste.getClassName())) {
            declaringClass = mappingForLine.getOriginalFile();
            methodName = mappingForLine.getIdentifier().orElse(null);
          }
          fileName = mappingForLine.getOriginalFile();
          lineNumber = mappingForLine.getLineNumber();
        }
      }
    }

    if (declaringClass != null) {
      return new StackTraceElement(declaringClass, methodName, fileName, lineNumber);
    }

So if the outer if is entered, there is a possibility that

methodName = mappingForLine.getIdentifier().orElse(null);

sets methodName to null.

Sadly I do not see the original exception because of a bug on our side.

jnehlmeier commented 1 month ago

Looks like the orElse(null) was added in https://github.com/gwtproject/gwt/pull/9936

niloc132 commented 1 month ago

The .orElse(null) was added because the mappingForLine.getIdentifier() was an optional protobuf field, and I mistakenly recalled that optional string fields could return null - it looks like they instead return empty string, so we should replace null with "".

Can you share a little more about the use case here? It seems as though in this case you'll be getting an empty string for the method name/identifier despite the sourcemap not having information for this member, which means erasing the original symbol map contents that were retrieved already. It seems possible that this case is better served by keeping the old methodName value, rather than replacing with empty string. We also don't have any test case here, and I haven't been able to reproduce this in GWT itself or other downstream projects.

jnehlmeier commented 1 month ago

Can you share a little more about the use case here?

What exactly do you mean? It is just GWT's uncaught exception handler which sends exceptions via GWT-RPC to the server which then deobfuscates it before logging it. Nothing special. The only change we do is that we have a custom exception wrapper class for GWT-RPC because GWT's SerializbleThrowable does not handle suppressed exceptions and as a consequence multiple causes of UmbrellaException (because it stores the first as cause and the rest as suppressed).

I have never seen a deobfuscated stack trace with an empty method name so far with older GWT versions even though the code seems to imply that it could have happened. Maybe older GWT versions never produced a sourcemaps file with such a situation.

In any case the code should try to keep the information of symbol maps if corresponding data is unavailable via sourcemaps.

niloc132 commented 1 month ago

Sorry, what i mean is the use case where the methodName should be an empty string. The old code did not attempt to keep the symbol map name if the sourcemap had no name, so that isn't a change here.

I'm specifically asking what pattern of code you have that can lead to methodName being assigned to null, and seeking to confirm that prior to this patch, it was assigned to an empty string. If it is null now but wasn't an empty string before, that could point to a bug (either introduced or fixed) that we can address - but the old OriginalMapping type was clear that identifier was optional, and might not be provided.

With that information, we can see if it would make sense to instead use the symbol map data or not.

jnehlmeier commented 1 month ago

Sorry, what i mean is the use case where the methodName should be an empty string.

I don't think there is any use case. I think it is just an oversight because the old code did not return an optional so it was easy to miss that it could end up being empty. Java itself (Temurin) does only check for null because it uses a special StackTraceElement constant for serialization which does have an empty method name.

I'm specifically asking what pattern of code you have that can lead to methodName being assigned to null, and seeking to confirm that prior to this patch, it was assigned to an empty string. If it is null now but wasn't an empty string before, that could point to a bug (either introduced or fixed) that we can address - but the old OriginalMapping type was clear that identifier was optional, and might not be provided.

A bit tough to say, without seeing the real exception :) As said before I cannot remember any deobfuscated stack trace with empty method names. I have fixed some of our code so it will at least show the original exception now, but maybe we go a step further and temporarly change our GWT build to use the symbol map method name if sourcemaps does not have one (possibly marking it with some underscores or similar so we know the issue occurred).

jnehlmeier commented 1 month ago

After I modified our GWT fork to handle the NPE using some placeholder strings I now have received a new bug report that would have triggered the NPE.

I changed GWT to use

methodName = mappingForLine.getIdentifier().orElse(
  "method__" + methodName + "__from_symbolmaps_as_sourcemaps_methodname_was_null");

and received a stack frame that looks like

java.lang.Throwable: com.google.gwt.event.shared.UmbrellaException: Exception caught: (TypeError) : Cannot read properties of null (reading 'Vr')
    at java.lang.Throwable.Throwable(com/google/gwt/emul/java/lang/Throwable.java:114)
    at java.lang.RuntimeException.RuntimeException(com/google/gwt/core/client/CodeDownloadException.java:21)
    at com.google.web.bindery.event.shared.UmbrellaException.UmbrellaException(UmbrellaException.java:592)
    at Unknown.new R1(app-0.js@65:594)
    at com.google.gwt.event.shared.HandlerManager.$fireEvent(HandlerManager.java:574)
    at com.google.gwt.user.client.ui.Widget.$fireEvent(com/google/gwt/user/client/ui/Widget.java:164)
    at com.google.gwt.event.dom.client.DomEvent.fireNativeEvent(com/google/gwt/event/dom/client/ClickEvent.java:50)
    at com.google.gwt.user.client.ui.Widget.$onBrowserEvent(com/google/gwt/user/client/ui/Widget.java:204)
    at com.google.gwt.user.client.ui.Widget.onBrowserEvent(Widget.java:319)
    at com.google.gwt.user.client.DOM.dispatchEvent(DOM.java:890)
    at com.google.gwt.user.client.impl.DOMImplStandard.dispatchEvent(com/google/gwt/user/client/impl/EventMap.java:45)
    at com.google.gwt.core.client.impl.Impl.apply(com/google/gwt/core/client/impl/Impl.java:76)
    at com.google.gwt.core.client.impl.Impl.entry0(com/google/gwt/core/client/impl/Impl.java:115)
    at com/google/gwt/core/client/impl/Impl.java.method__null__from_symbolmaps_as_sourcemaps_methodname_was_null(com/google/gwt/core/client/impl/Impl.java:383)

// original JS exception
java.lang.Throwable: com.google.gwt.event.shared.UmbrellaException: Exception caught: (TypeError) : Cannot read properties of null (reading 'Vr')
    at Unknown.hL(app-0.js@60:334)
    at Unknown.xL(app-0.js@21:340)
    at Unknown.P1(app-0.js@125:592)
    at Unknown.new R1(app-0.js@65:594)
    at Unknown.Z0(app-0.js@121:574)
    at Unknown.$I(app-0.js@26:313)
    at Unknown.iZ(app-0.js@145:546)
    at Unknown.bJ(app-0.js@114:315)
    at Unknown.mJ(app-0.js@206:319)
    at Unknown.C1e(app-0.js@85:890)
    at Unknown.g3e(app-0.js@46:936)
    at Unknown.WM(app-0.js@29:401)
    at Unknown.ZM(app-0.js@57:404)
    at Unknown.<anonymous>(app-0.js@46:403)

// Code at line 403 (weird "var a" at the end). Because the issue was triggered 
// by click event this code is likely the event listener wrapped with $entry()
function YM(b) {
    VM();
    return function() {
        return ZM(b, this, arguments); 
        var a
    }
}

// ZM is Impl.entry0()
function ZM(b, c, d) {
    var e, f;
    e = XM();
    try {
        if (CL) {
            try {
                return WM(b, c, d)
            } catch (a) {
                a = dPe(a);
                if (h9(a, 91)) {
                    f = a;
                    dN(f, true);
                    return undefined
                } else
                    throw ePe(a)
            }
        } else {
            return WM(b, c, d)
        }
    } finally {
        $M(e)
    }
}

So anonymous functions are one situation that could trigger the NPE and Java line numbers might be messed up because of JSNI?

niloc132 commented 1 month ago

YM is entry() (well, the anonymous function wrapper) as you say: https://github.com/gwtproject/gwt/blob/4cc3679003051fe6ee5177c23959fadf3718f566/user/src/com/google/gwt/core/client/impl/Impl.java#L76-L87 The extra var looks like it is the hoisted _ from the else, and the JS optimizer is pretty limited, and just wasn't able to rewrite it away.

R1 is supposed to be the constructor of the com.google.gwt.event.shared.UmbrellaException, which then calls super().

I've debugged the old sourcemap impl, and the difference is not that it didn't return null here, but that it used to return null for the entire mappingForLine instance in some cases, and empty string for the mappingForLine.getIdentifier() in othres. Now we get a mappingForLine instance back from the sourcemap reader, but it just has an absent identifier. This suggests that to be consistent with the old code, we should just add a mappingForLine.getIdentifier().isPresent() check in the if that checked if mappingForLine itself was null. I'll still experimenting to make sure old and new are consistent in how they handle this, and will put this change up soon.

The LoggingRPCTest example trace also goes through Impl, but doesn't have any incorrect line numbers that I can see, either before or after this patch (except for the wiring of the split point, used here to make sure line numbers are accurate across split points).

I also found that browsers provide somewhat different stack traces than htmlunit does - should look into running real browsers for GWT's tests...


To return to the bug that this ticket was filed about... do you find that browser dev tools also have wrong mappings here? I wrote a small change to the Hello sample and tried to confirm that it had consistent Chrome and server side deobfuscator traces, and it seems to be roughly correct on both sides, at least not having issues to the degree you're seeing. If both are consistently wrong, it might suggest it isn't our deobfuscator that is faulty, but the sourcemap generation itself (plus or minus specifics of how your app is linking output and adjusting offsets), but if they are inconsistent, it suggests we're somehow reading the frames wrong. Still need a concrete example

niloc132 commented 1 month ago

Update on the above missing mappings: this won't work, the symbol mapper assumes that sourcemaps will be functional, so a slightly more nuanced approach is required. Follow up at https://github.com/gwtproject/gwt/issues/9996.


Update on the ticket itself: the sourcemaps are just fine, but the output JS in question apparently has extra newlines every once in a while. Joining those lines into how other compiled apps look results in sourcemaps that make sense, so something is going on in the late compilation process (generating output JS, linking, or something post-compiler) that is adding extra newlines to the generated JS output which other sample apps don't seem to experience.

jnehlmeier commented 1 month ago

After debugging the GWT compiler here is what is going on:

The SelectionScriptLinker has a feature that allows chunking the initial code fragment of a given CompilationResult(= permutation). See SelectionScriptLinker.generatePrimaryFragment() and SelectionScriptLinker.splitPrimaryJavaScript().

The method SelectionScriptLinker.splitPrimaryJavaScript() does have a check at the beginning which should prevent chunking if the permutation uses sourcemaps. The reason is that chunking adds newlines for each chunk but cannot update sourcemaps for now. The check walks over LinkerContext.getProperties() which returns all discovered SelectionPropertys, searches for property compiler.useSourceMaps and checks wether or not the property is true. In all other cases, including a null value, the code assumes sourcemaps is not used and will continue producing chunks.

A StandardSelectionProperty is created from a BindingProperty and the value is extraced using BindingProperty.getConstrainedValue(). This method returns null if the binding property does not have a single unique value.

The app that produces broken stack traces via sourcemaps had

CompilerParameters.gwt.xml (provided by GWT):

<!--
   Specifies whether a given permutation has source map support enabled.
-->
<define-property name="compiler.useSourceMaps" values="true,false"/>
<set-property name="compiler.useSourceMaps" value="false"/>

App.gwt.xml:

<!--
   Enable source maps for safari/chrome
-->
<set-property name="compiler.useSourceMaps" value="true">
  <when-property-is name="user.agent" value="safari" />
</set-property>

Using the above the BindingProperty for compiler.useSourceMaps now has two entries stored internally. One entry points to false and the other to true. Thus the values are not unique and BindingProperty.getConstrainedValue() returns null.

So even though the permutation is configured using source maps the compiler now converts the initial fragment to chunks. These chunks add newlines to the JS source code which in turn messes up the source map mappings.

An easy workaround is to allow source maps for all permutations, e.g.

<set-property name="compiler.useSourceMaps" value="true" />

In that case the BindingProperty only holds one value true.

So to reproduce the issue you have to

Then any stack trace that uses code defined in the initial fragment can have bad stack frames.