eerohele / saxon-gradle

A Gradle plugin for running XSLT transformations with Saxon
MIT License
16 stars 6 forks source link

Fix collection corruption in getIncludedStylesheets #20

Closed ndw closed 3 years ago

ndw commented 3 years ago

One of the first projects that I tried to use saxon-gradle on was the DocBook xslTNG stylesheets. I had mixed success including a couple of tasks that seemed to just hang. At the time, I attributed it to memory issues (they're big documents). I put the extension on the back burner and used JavaExec for those.

When I came back to it all yesterday, I worked through the bug about missing stylesheets and stumbled right back over the tasks that would hang. What, I wondered, is going on? I wanted to make sure it wasn't a Saxon bug (I work for Saxonica now 😄 ), so I instrumented my hacked up version of the extension to print out the Saxon arguments.

Except, they didn't print out. It was hanging before it reached the call to Saxon. (Ok, not a Saxon bug, yay!)

Curiosity aroused, I started unpeeling the layers. Eventually I got to this debugging statement in getInputFiles:

println("Stylesheets: ${stylesheets.size()}")

It didn't print. In fact, the call to size() was never returning.

That got me to take a closer look at getIncludedStylesheets:

  project.files(stylesheet) + (xslt.include + xslt.import).inject(stylesheets) { acc, i ->
                URI href = resolveUri(i.@href[0].toString())
                URI uri = stylesheet.toURI().resolve(href)
                acc + getIncludedStylesheets(new File(uri), acc)
            }

I believe that when run against a stylesheet that has multiple levels of include or import statements, that fold ends up mutating the stylesheets FileCollection while it's iterating over it. And I think that corrupts the collection. Whether Groovy should detect this or not, I'm not sure.

This PR rewrites the getIncludedStylesheets collection in a somewhat less clever way that avoids this problem.

I hope you'll accept this PR. If you do, could you please republish the plugin? I have dreams of getting this "small task I started on Friday evening" finished before Monday! 😁

eerohele commented 3 years ago

Ah. The joys of mutable collections.

I've merged your fix and cut v0.9.0-beta3 to unblock you. However, if you have a minimal reproduction for this issue, I'd like to take a look. Having a test for this fix would be very useful.

eerohele commented 3 years ago

a minimal reproduction

Although I guess "minimal" is the wrong word if you can only reproduce the issue with large builds... any way to reproduce the issue with reasonable effort would do.

ndw commented 3 years ago

Now that I've worked out what the issue is, I should be able to extract the example out of the larger build. Will get back to you this afternoon.

ndw commented 3 years ago

Okay, clone my version of the repo from https://github.com/ndw/saxon-gradle and check out the corruption branch. Running ./gradlew guide will hang the process. If you update the extension version to beta3, it runs. (You'll get a bunch of warnings and errors because I've stripped out a bunch of things, but you'll see that XSLT does process the file.)

Thanks for the quick turnaround. I might yet get this finished today! :-)

eerohele commented 3 years ago

Thanks! I'll see if I can get to it next week.