jeremylong / DependencyCheck

OWASP dependency-check is a software composition analysis utility that detects publicly disclosed vulnerabilities in application dependencies.
https://owasp.org/www-project-dependency-check/
Apache License 2.0
6.45k stars 1.28k forks

owasp maven plugin 5.0.0-M1 uses user environment to determine encoding for dependency parsing #1742

Open jjYBdx4IL opened 5 years ago

jjYBdx4IL commented 5 years ago
[INFO] --- dependency-check-maven:5.0.0-M1:check (default) @ tests2html ---
[INFO] Central analyzer disabled
[INFO] Checking for updates
[INFO] Skipping NVD check since last check was within 4 hours.
[INFO] Skipping RetireJS update since last update was within 24 hours.
[INFO] Check for updates complete (9 ms)
[INFO] Analysis Started
[WARNING] An unexpected error occurred during analysis of '/home/mark/.m2/repository/com/javaslang/javaslang/2.0.0-beta/javaslang-2.0.0-beta.jar' (Archive Analyzer): Malformed input or input contains unmappable characters: javaslang/?$Type$1ReflectionUtil.class
[ERROR]
java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: javaslang/?$Type$1ReflectionUtil.class
    at sun.nio.fs.UnixPath.encode (UnixPath.java:145)
    at sun.nio.fs.UnixPath.<init> (UnixPath.java:69)
    at sun.nio.fs.UnixFileSystem.getPath (UnixFileSystem.java:280)
    at java.nio.file.Path.resolve (Path.java:515)
    at org.owasp.dependencycheck.analyzer.ArchiveAnalyzer.extractArchive (ArchiveAnalyzer.java:536)
    at org.owasp.dependencycheck.analyzer.ArchiveAnalyzer.extractFiles (ArchiveAnalyzer.java:409)
    at org.owasp.dependencycheck.analyzer.ArchiveAnalyzer.extractAndAnalyze (ArchiveAnalyzer.java:251)
    at org.owasp.dependencycheck.analyzer.ArchiveAnalyzer.analyzeDependency (ArchiveAnalyzer.java:233)
    at org.owasp.dependencycheck.analyzer.AbstractAnalyzer.analyze (AbstractAnalyzer.java:136)
    at org.owasp.dependencycheck.AnalysisTask.call (AnalysisTask.java:88)
    at org.owasp.dependencycheck.AnalysisTask.call (AnalysisTask.java:37)
    at java.util.concurrent.FutureTask.run (FutureTask.java:264)
    at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
    at java.lang.Thread.run (Thread.java:834)

That occurs only when LC_ALL=C, not when LC_ALL=de_DE.UTF-8.

dranuhl commented 5 years ago

Does your pom.xml specify the <project.build.sourceEncoding> property? If not, Maven falls back to the JVM's default encoding, which is derived from the environment. Maven normally emits a warning when this property is not set, and the warning names the encoding it will assume. You might want to check for it.

jjYBdx4IL commented 5 years ago

Even so, the jars being parsed do not care about those settings.

This is the class in question:

https://static.javadoc.io/io.javaslang/javaslang/2.0.0/javaslang/%CE%BB.html

Making your plugin dependent on the user environment is not a good idea, because the next plugin may impose contradictory requirements.

It seems that this is a limitation of Java's Path.resolve itself and how Java generally handles file names. Maven "clean" had this issue, too, where it failed to clean if there were "invalid" filenames around, which is kinda funny because those files actually existed on disk - so they really weren't invalid from the OS perspective.

https://stackoverflow.com/questions/22775758/java-io-file-accessing-files-with-invalid-filename-encodings

The gist is: this can break anywhere the local charset is less expressive than Unicode, e.g. with some regular Windows code pages.
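
To illustrate the point about charsets, here is a minimal sketch that checks whether an archive entry name can be represented in a given charset. It only mimics the kind of check that fails inside Path.resolve; it is not the JDK's actual implementation, and the class and method names (EntryNameEncodingCheck, isEncodable) are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of dependency-check or the JDK.
class EntryNameEncodingCheck {

    /** Returns true if every character of the entry name can be encoded in the charset. */
    static boolean isEncodable(String entryName, Charset charset) {
        // Create a fresh encoder per call: CharsetEncoder instances are not thread-safe.
        return charset.newEncoder().canEncode(entryName);
    }

    public static void main(String[] args) {
        String entry = "javaslang/\u03bb.class"; // the λ class from the report

        System.out.println("US-ASCII: " + isEncodable(entry, StandardCharsets.US_ASCII));
        System.out.println("UTF-8:    " + isEncodable(entry, StandardCharsets.UTF_8));

        // What this JVM picked up from its environment.
        System.out.println("default charset:  " + Charset.defaultCharset());
        System.out.println("sun.jnu.encoding: " + System.getProperty("sun.jnu.encoding"));
    }
}
```

The entry name is encodable in UTF-8 but not in US-ASCII, which is consistent with the failure appearing under LC_ALL=C and disappearing under a UTF-8 locale.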

jjYBdx4IL commented 5 years ago

The only solution I can think of that works in every case: use only ASCII filenames when extracting. Maybe don't use the filenames from the archive at all and replace them with sequential numbers + .ext, keeping a mapping somewhere outside of the user's filesystem namespace.
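
A rough sketch of that idea, assuming the mapping is simply kept in memory. All names here (AsciiExtractionNames, safeNameFor, originalNameOf) are hypothetical, and this is not how dependency-check's ArchiveAnalyzer actually works.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: sequential ASCII-only names for extracted archive entries.
class AsciiExtractionNames {
    // Safe on-disk name -> original archive entry name, kept outside the filesystem.
    private final Map<String, String> originalBySafeName = new LinkedHashMap<>();
    private int counter = 0;

    /** Returns the next sequential name, keeping the extension only if it is plain ASCII. */
    String safeNameFor(String archiveEntryName) {
        String ext = "";
        int slash = archiveEntryName.lastIndexOf('/');
        int dot = archiveEntryName.lastIndexOf('.');
        // Only treat the dot as an extension separator if it is in the last path segment.
        if (dot > slash && dot < archiveEntryName.length() - 1) {
            String candidate = archiveEntryName.substring(dot + 1);
            if (candidate.matches("[A-Za-z0-9]{1,10}")) {
                ext = "." + candidate;
            }
        }
        String safe = (counter++) + ext;
        originalBySafeName.put(safe, archiveEntryName);
        return safe;
    }

    /** Looks up the original entry name, or null if the safe name is unknown. */
    String originalNameOf(String safeName) {
        return originalBySafeName.get(safeName);
    }

    public static void main(String[] args) {
        AsciiExtractionNames names = new AsciiExtractionNames();
        String safe = names.safeNameFor("javaslang/\u03bb.class");
        System.out.println(safe + " <- " + names.originalNameOf(safe));
    }
}
```

The extracted file names never contain anything outside ASCII, so they resolve regardless of locale, while reports can still show the original entry name via the mapping.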

ninaDeimos commented 5 years ago

We're having the same problem here. We're using v5.2.0

[DependencyCheck] [WARN] An unexpected error occurred during analysis of '/data/jobs/c3-fehlertool-nightly/workspace/deployment/build/exploded/installer/gradle/wrapper/gradle-5.0-bin.zip' (Archive Analyzer): Malformed input or input contains unmappable characters: javaslang/?.class
[DependencyCheck] [ERROR] 
[DependencyCheck] java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: javaslang/?.class
[DependencyCheck]   at java.base/sun.nio.fs.UnixPath.encode(UnixPath.java:145)

But in our case, setting LC_ALL=de_DE.UTF-8 had no effect. Is there a workaround for this? Many of our projects use Gradle 5.0, which happens to trigger this error, so I have to disable the Dependency Check for all of them at the moment...

Edit: For the moment I have downgraded to v4.0.2.

Moes81 commented 5 years ago

Any update on this one? We're running into the same issue: An unexpected error occurred during analysis of '/root/.gradle/caches/modules-2/files-2.1/org.jetbrains.kotlin/kotlin-compiler-embeddable/1.3.50/1251c1768e5769b06c2487d6f6cf8acf6efb8960/kotlin-compiler-embeddable-1.3.50.jar' (Archive Analyzer): Malformed input or input contains unmappable characters: javaslang/?.class

In our case, we ran the dependencyCheck inside a Docker container. Upgrading to a container with the latest JDK solved the problem. Some funny guy thought it would be a good idea to name the class "λ.class" instead of "Lambda.class". Thank you for that great idea! :-D

jeremylong commented 5 years ago

Can someone facing this issue try setting:

export JAVA_OPTS="-Dsun.jnu.encoding=UTF-8 -Dfile.encoding=UTF-8"

Note: if running via Gradle, you may need to set GRADLE_OPTS instead and run with ./gradlew --no-daemon.

ninaDeimos commented 5 years ago

> Can someone facing this issue try setting:
>
> export JAVA_OPTS="-Dsun.jnu.encoding=UTF-8 -Dfile.encoding=UTF-8"

I use the Jenkins plugin, so I don't think I can do that.

> In our case, we ran the dependencyCheck inside a docker container. Upgrading to a container with the latest JDK solved the problem. Some funny guy thought it would be a good idea to name the class "Lambda.class" actually "λ.class". Thank you for that great idea! :-D

@Moes81 I see this error in our case too, and I would be interested to know which JDK you ended up using. I tried the newest releases from Amazon Corretto (https://docs.aws.amazon.com/corretto/index.html) and the "Lambda.class" is still named "λ.class" there... I guess you used the Oracle JDK?

Moes81 commented 5 years ago

@ninaDeimos This container works for us now: adoptopenjdk/openjdk11:jdk-11.0.4_11-alpine

ninaDeimos commented 5 years ago

@Moes81 OK, so I guess my mistake was assuming that the Jenkins DependencyCheck plugin would use the JDK configured for the job. It looks like it uses the JDK that Jenkins itself runs on...

patrickherrera commented 4 years ago

@jeremylong I tried your JAVA_OPTS. Without it I was getting this in my terminal: Malformed input or input contains unmappable characters: javaslang/?.class

With it, it still fails but I get the following line displayed instead: Malformed input or input contains unmappable characters: javaslang/λ.class

The build fails in Docker using the following JDK: 11.0.6 (Debian 11.0.6+10-post-Debian-1deb10u1), but succeeds outside Docker running against my locally installed JDK: 11.0.5 (Oracle Corporation 11.0.5+10). I'm using Gradle 5.6.4 in all instances. I'll keep playing around and post if I find anything else.

patrickherrera commented 4 years ago

Just adding ENV LC_ALL C.UTF-8 to my Dockerfile was sufficient to fix this for me (taken from https://stackoverflow.com/a/41648500/647581). My Docker image is based on Debian 'buster'

seanf commented 4 years ago

Thanks to the suggestion by @jeremylong, I found that adding this to the environment worked:

export MAVEN_OPTS="-Dsun.jnu.encoding=UTF-8 -Dfile.encoding=UTF-8"

functicons commented 1 year ago

Is it possible to provide an option to make the error non-fatal? That is, if certain files cannot be read, do not fail the whole scan; just log the errors.
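
For what it's worth, the behavior being asked for could look roughly like the sketch below: catch InvalidPathException per entry, log it, and carry on. This is only an illustration under assumed names (LenientEntryResolver, resolveAll); it is not dependency-check's code, and no such option is confirmed in this thread. The example uses a NUL character to trigger the exception because that is rejected regardless of locale, whereas the λ case depends on the environment.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of dependency-check.
class LenientEntryResolver {
    final List<Path> resolved = new ArrayList<>();
    final List<String> skipped = new ArrayList<>();

    /** Resolves each entry against baseDir; unresolvable names are logged and skipped. */
    void resolveAll(Path baseDir, List<String> entryNames) {
        for (String name : entryNames) {
            try {
                resolved.add(baseDir.resolve(name));
            } catch (InvalidPathException e) {
                // Non-fatal: remember the entry and keep scanning the rest of the archive.
                skipped.add(name);
                System.err.println("[WARN] Skipping unresolvable entry: " + e.getReason());
            }
        }
    }

    public static void main(String[] args) {
        LenientEntryResolver resolver = new LenientEntryResolver();
        resolver.resolveAll(Paths.get("extract-dir"),
                Arrays.asList("javaslang/Type.class", "bad\u0000name.class"));
        System.out.println("resolved=" + resolver.resolved.size()
                + " skipped=" + resolver.skipped.size());
    }
}
```

The trade-off is that skipped entries are never analyzed, so a report produced this way would need to list them to avoid giving a false sense of coverage.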

marcelstoer commented 4 months ago

What is the plan with this issue? If it won't/can't be fixed in the plugin, should it be closed and the workaround documented somewhere?

I'm on ODC 10.0.1 running in a node:20-bookworm-slim (Debian) container. I can confirm that setting LC_ALL=C.UTF-8 fixes the issue.

jeremylong commented 4 months ago

@marcelstoer I think documentation is the best option. Just to confirm: all you added to your Dockerfile was the following?

ENV LC_ALL=C.UTF-8

marcelstoer commented 4 months ago

Yes, setting LC_ALL works.