FasterXML / jackson-jaxrs-providers

Multi-module project that contains Jackson-based "old" JAX-RS (ones under `javax.ws.rs`) providers for JSON, XML, YAML, Smile, CBOR formats
Apache License 2.0
109 stars 77 forks source link

Backport Snakeyaml update to branch 2.12 #177

Closed janjwerner-confluent closed 6 months ago

janjwerner-confluent commented 7 months ago

Hi @cowtowncoder I'm working on CVE cleanup in Druid, where I ran into a wall with usage of Snakeyaml. As it was mentioned in https://github.com/FasterXML/jackson-jaxrs-providers/issues/134 Druid cannot easily move to newer version of of jaxrs-providers. Would you be willing to accept a PR and release another 2.12 branch?

cowtowncoder commented 7 months ago

I could accept a PR (if that made it easier from your side) but would not release a new version, too much work.

But I think that forcing a newer SnakeYAML version might work -- I don't think 2.12 was shading dependency, and SnakeYAML API has remained about same.

janjwerner-confluent commented 7 months ago

@cowtowncoder Thank you for the swift response. The forcing of a newer SnakeYAML version unfortunately does not work - the default constructor has changed and just overriding the dependency leads to a failure.

The required change is trivial:

-        _yamlParser = new ParserImpl(new StreamReader(reader));
+        _yamlParser = new ParserImpl(new StreamReader(reader), new LoaderOptions());

and adding an import: +import org.yaml.snakeyaml.LoaderOptions; in yaml/src/main/java/com/fasterxml/jackson/dataformat/yaml/YAMLParser.java

cowtowncoder commented 7 months ago

Ok that is unfortunately a no go then since:

  1. Would cause compatibility problem in patch (since that new parser constructor not available in SnakeYAML version supported by earlier 2.12.x releases)
  2. Requires full release (since more than 1 component), which means 3-4 hour release process

Sorry.

cowtowncoder commented 7 months ago

Having said that, then overriding jackson-dataformat-yaml dependency to newer one might work. But I guess that could be a rat hole as newer version of YAML format module probably does then require newer jackson-core and perhaps jackson-databind. So it depends on exactly what prevents use of later JAX-RS provider version.

janjwerner-confluent commented 7 months ago

Thank you for feedback and the consideration!

pjfanning commented 7 months ago

Users can simply copy paste code and modify it. There is no reason for Jackson to issue new releases for every minor release every time a dependency jar has a CVE.

cowtowncoder commented 7 months ago

I do think it's a PITA to maintain a fork -- users typically build with dependency jars from Maven repositories (or similar), except for few that build all from sources -- so I think it was a reasonable thing to ask if this was feasible.

But yes, building new patches for old branches would be potentially tons of work for Jackson project, unfortunately.

janjwerner-confluent commented 7 months ago

@pjfanning I agree and where possible, I would try to attempt to upgrade the dependency that brings the vulnerable component, or override the transitive dependency. In this particular case, first solution is very labor intensive, second solution is breaking due to incompatibile API. I had to ask - the worst thing that can happen is someone telling you no :)

pjfanning commented 7 months ago

Is Druid still using Jersey 1.x? That is the reason Hadoop is stuck on Jackson 2.12. Jackson 2.13 removed the support for Jersey 1.x. If Druid does not use Jersey 1.x, you could probably upgrade to a much newer version of Jackson.

I offered to help get Hadoop upgraded - including doing a fork of the jackson-jaxrs libs that added back Jerset 1.x support but they weren't interested.

janjwerner-confluent commented 7 months ago

Unfortunately, yes: 1.19.4 https://github.com/apache/druid/blob/e68979e03be2ff6b453aa5a5e761a677e64ff041/pom.xml#L101

cowtowncoder commented 7 months ago

Hmmmmh. Oh. If there are multiple high-visibility project with this problem that might change things (Hadoop and Druid are both fairly popular)

My main concern is with SnakeYAML compatibility in that case. Is the goal to get to SnakeYAML 2.x? If I remember correctly, 1.x / 2.x had compatibility problem (which is fair, it being major version bump).

pjfanning commented 7 months ago

@cowtowncoder this all goes back to https://github.com/FasterXML/jackson-jaxrs-providers/issues/134#issuecomment-1180637522

That change has made sure that Hadoop and Druid are stuck on Jackson 2.12 and can't upgrade. I know on the Hadoop side that attempts have been made to switch to Jersey 2 but these have failed. They are stuck indefinitely on Jersey 1 and Jackson does not support Jersey 1. So all changes in other Jackson modules - like the use of snakeyaml 2 are unattainable for Hadoop and Druid (and some other Hadoop reliant Big Data projects).

cowtowncoder commented 7 months ago

@pjfanning Yes, I got that. Which is why I might be more amenable to selected patching of 2.12 branch, not just JAX-RS, and one new full release (but not one per update). Not that I look forward to doing new release(s), but at least there would be reasons to do so for, f.ex, dependency CVEs. And ideally do as a batch.

But it is definitely risky, due to distance between latest (2.16) and 2.12.

pjfanning commented 7 months ago

Is there any chance that we could consider undoing #134 ? That would probably be less work that cherry picking changes and doing releases.

cowtowncoder commented 7 months ago

Looking at change, it's sizable enough that I really do not want to reintroduce it. At least not to the head; I guess theoretically could consider some other version (2.15? leave out from 2.16+). That does not sound good either as acceptable version range would be discontinuous (2.12, 2.15). :-(

But as to cherry-picking, I would want to limit these to things that downstream projects would request -- like, creating an issue, and if we could find people involved, let them list top CVEs (and provide PRs). But keep the list small. So no new features, minimal behavioral changes. So this would be unusual one actually, otherwise it'd be just updating dependencies from pom.xmls.

Not sure that is a good idea either, just writing out my first thoughts.

janjwerner-confluent commented 7 months ago

@pjfanning Thank you for providing the context. @cowtowncoder I'd be happy to do the legwork - provide PRs for the dependency changes, test out things locally, if any of that makes a difference.

cowtowncoder commented 7 months ago

That makes sense. I think YAML one is the tricky one, no matter how we slice it. If I remember correctly, there was no real way to make same code (jackson-dataformat-yaml) work both on latest 1.x and 2.0.

pjfanning commented 6 months ago
cowtowncoder commented 6 months ago

Good points/questions.

I forgot to suggest one more minimal alternative: jackson-dataformat-yaml 2.12.7.1 only. While not SemVer compliant either would limit blast radius. No need for jaxrs providers just needs override for YAML module

On Sat, Dec 9, 2023 at 5:47 AM PJ Fanning @.***> wrote:

  • Apache Druid only seems to use jackson-dataformat-yaml in test code

    https://github.com/search?q=repo%3Aapache%2Fdruid+dataformat-yaml&type=code

  • Releasing a jackson-dataformat-yaml 2.12.7.1 that changes to snakeyaml 2.2 is not a good move from a semver perspective
    • many users will upgrade thinking it's a trivial upgrade and it is not
    • Jackson itself waited till a minor release to this upgrade
  • Can Apache Druid team provide better evidence that they are seriously impacted - in what way can malicious users supply dangerous yaml payloads? @janjwerner-confluent https://github.com/janjwerner-confluent you can provide this privately if this is sensitive - maybe you should notify @.*** I am a member of the ASF Security committee and will receive any emails sent to ASF security@ email addresses.
  • The OSS community would be better served by the Apache Druid team upgrading to jersey 2 than focusing on getting other teams hack their releases for them
  • If Apache Druid team absolutely must upgrade snakeyaml, you can almost certainly upgrade the affected pom.xml to use a newer version com.fasterxml.jackson.dataformat while excluding its Jackson transitive dependencies to prevent you from upgrading the core jackson jars.
    • jackson-dataformat-yaml 2.16.0 looks like it has changes that won't work with jackson-core/jackson-databind 2.12
    • but we'll probably find an imtermediate jackson-dataformat-yaml that does use snakeyaml 2.x and that works with jackson-core/jackson-databind 2.12
    • I can help with reviewing any Druid PRs that attempt to go this route

— Reply to this email directly, view it on GitHub https://github.com/FasterXML/jackson-jaxrs-providers/issues/177#issuecomment-1848414952, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAANOGJKXI62ZTP5NQZGHSLYIRTWTAVCNFSM6AAAAABAMB3MWWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNBYGQYTIOJVGI . You are receiving this because you were mentioned.Message ID: @.***>

pjfanning commented 6 months ago

I would really much prefer to see the paper trail on this. I am a member of the ASF. I do not see any Apache discussions about this. I think this is completely cart before horse. The ASF warrants public discussions and peer overview. I don't see any Druid discussion going on about this.

In my fork of Apache Druid, there are 50 dependabot issues - only 1 is related to snakeyaml. What about the 49 other dependency issues?

Upgrading snakeyaml also brings in the annoying 5k limit that the snakeyaml maintainer introduced in v1.32. The snakeyaml v2 upgrade did not appear to add extra security improvements, just made the 'safe' constructor behaviour the default. So snakeyaml 1 users can still use 'safe' constructors, they just need to explicitly use them. Maybe I'm wrong but my understanding is that jackson-dataformat-yaml has long used the 'safe' constructor code path. That is, when we upgraded the snakeyaml jar in jackson-dataformat-yaml, we were just trying to pre-empt the people who think security is about ticking boxes about jar versions and not actually reading the CVEs and understanding if they apply to their use case.

janjwerner-confluent commented 6 months ago

Thank you both @pjfanning @cowtowncoder

pjfanning commented 6 months ago

I updated Hadoop to use snakeyaml 2 - see https://issues.apache.org/jira/browse/HADOOP-18658

Until there is proper Druid discussion and reasonable proof that Druid has a real security issue related to snakeyaml - and a reasonable attempt is made to update snakeyaml independently of requiring a special jackson patch, I think this is a bad way to go forward.

janjwerner-confluent commented 6 months ago

@pjfanning I got it, i will look into carving out snakeyaml in Druid - specify version 1.33 as test dependency only, and let other libraries bring their own versions. Thank you for your insight and patience.

cowtowncoder commented 6 months ago

Sounds like a good approach for now. If we can avoid need for new 2.12 releases (full or micro), that would be great.

cowtowncoder commented 6 months ago

Closing for now.