DependencyTrack / dependency-track

Dependency-Track is an intelligent Component Analysis platform that allows organizations to identify and reduce risk in the software supply chain.
https://dependencytrack.org/
Apache License 2.0

Sudden failure to decrypt the OSS Index API Token #2366

Open carniz opened 1 year ago

carniz commented 1 year ago

Current Behavior

We're seeing something that resembles an expiration of the OSS Index API Token: after running without issues for a couple of days with our API Token (created at https://ossindex.sonatype.org/user/settings), we suddenly see messages such as these in the logs:

2022-12-28 08:00:27,749 ERROR [OssIndexAnalysisTask] An error occurred decrypting the OSS Index API Token. Skipping
javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
    at java.base/com.sun.crypto.provider.CipherCore.unpad(Unknown Source)
    at java.base/com.sun.crypto.provider.CipherCore.fillOutputBuffer(Unknown Source)
    at java.base/com.sun.crypto.provider.CipherCore.doFinal(Unknown Source)
    at java.base/com.sun.crypto.provider.AESCipher.engineDoFinal(Unknown Source)
    at java.base/javax.crypto.Cipher.doFinal(Unknown Source)
    at alpine.security.crypto.DataEncryption.decryptAsBytes(DataEncryption.java:136)
    at alpine.security.crypto.DataEncryption.decryptAsBytes(DataEncryption.java:148)
    at alpine.security.crypto.DataEncryption.decryptAsString(DataEncryption.java:176)
    at org.dependencytrack.tasks.scanners.OssIndexAnalysisTask.inform(OssIndexAnalysisTask.java:138)
    at org.dependencytrack.tasks.VulnerabilityAnalysisTask.performAnalysis(VulnerabilityAnalysisTask.java:169)
    at org.dependencytrack.tasks.VulnerabilityAnalysisTask.analyzeComponents(VulnerabilityAnalysisTask.java:128)
    at org.dependencytrack.tasks.VulnerabilityAnalysisTask.inform(VulnerabilityAnalysisTask.java:77)
    at alpine.event.framework.BaseEventService.lambda$publish$0(BaseEventService.java:101)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)

When this happens, no new (OSS Index fed) vulnerabilities are detected until we create a new API token and set it in /admin -> Analyzers -> Sonatype OSS Index.

Steps to Reproduce

  1. Create an API Token at https://ossindex.sonatype.org/user/settings
  2. Enter the API token in /admin -> Analyzers -> Sonatype OSS Index
  3. Let DTrack run for a while (a week or two) while you upload SBOMs
  4. Check the logs

Expected Behavior

The OSS Index API Token never "expires"

Dependency-Track Version

4.7.0

Dependency-Track Distribution

Container Image

Database Server

PostgreSQL

Database Server Version

13.7

Browser

N/A

nscuro commented 1 year ago

Thanks for reporting @carniz!

This error happens when the secret couldn't be decrypted using the secret key (stored in /data/keys/secret.key). A common cause is restoring the database from a backup without also restoring the secret key that was used to encrypt the values in the database.
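To illustrate the mechanism, here is a minimal, self-contained sketch (not Alpine's actual DataEncryption implementation) of why decrypting with a key other than the one used for encryption surfaces as BadPaddingException:

import java.nio.charset.StandardCharsets;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Sketch only: encrypt with one AES key, then decrypt with a different one.
// This almost always fails with javax.crypto.BadPaddingException, the same
// symptom seen when the database values and secret.key are out of sync.
public class KeyMismatchSketch {
    public static void main(String[] args) throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        SecretKey keyUsedForEncryption = keyGen.generateKey(); // the original secret.key
        SecretKey freshlyGeneratedKey = keyGen.generateKey();  // e.g. a new key created on an empty /data volume

        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding"); // mode/padding choice is illustrative
        cipher.init(Cipher.ENCRYPT_MODE, keyUsedForEncryption);
        byte[] ciphertext = cipher.doFinal("oss-index-api-token".getBytes(StandardCharsets.UTF_8));

        cipher.init(Cipher.DECRYPT_MODE, freshlyGeneratedKey);
        cipher.doFinal(ciphertext); // almost always throws BadPaddingException
    }
}

In other words, the ciphertext stored in the database and the secret.key on disk have to come from the same installation; if either one is replaced independently, decryption fails with exactly the error shown above.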

Few questions to better understand the situation:

  • Did you upgrade from a previous DT version, or is this an entirely new instance?
  • Have you mounted the container's /data volume somehow so it is persisted across container restarts?
  • Has the instance been running the whole time, or was it restarted somewhere in between?
  • Can you correlate the behavior you're seeing with container restarts, or is this completely unrelated?

rkg-mm commented 1 year ago

I am seeing the same issue - not the first time. But what @nscuro said makes sense. I don't have the /data volume mounted, so if a key is persisted there, will any upgrade make this fail? Is that somewhere in the docs? I thought mounting it was optional since it only holds caches :(

stevespringett commented 1 year ago

See https://docs.dependencytrack.org/getting-started/data-directory/

stevespringett commented 1 year ago

The "optional" part is overriding the location of the data directory. But the directory itself is mandatory. https://docs.dependencytrack.org/getting-started/deploy-docker/

nscuro commented 1 year ago

That doc page 👆 does not mention the secret key and its purpose, though. I'll need to add that. I was also planning to add "backup & restore" documentation.

Anyway, there's some additional documentation regarding the secret key here: https://docs.dependencytrack.org/getting-started/configuration/#secret-key
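As additional background, the secret key is just a small file of symmetric key material. A minimal illustrative sketch of loading such a file, assuming raw AES-256 key bytes (consistent with the 32-byte secret.key listed further down in this thread) and not Alpine's actual key-handling code:

import java.nio.file.Files;
import java.nio.file.Path;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: wrap the persisted key bytes in an AES SecretKey.
// Pass the path to the secret.key file as the first argument.
public class SecretKeySketch {
    public static void main(String[] args) throws Exception {
        byte[] keyBytes = Files.readAllBytes(Path.of(args[0]));
        SecretKey key = new SecretKeySpec(keyBytes, "AES");
        System.out.println("Loaded " + (keyBytes.length * 8) + "-bit " + key.getAlgorithm() + " key");
    }
}

If that file is not persisted across container restarts, a fresh key is generated on startup and values encrypted with the old key can no longer be decrypted.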

rkg-mm commented 1 year ago

Yes please add some hint about this in the docs, that would help :)

carniz commented 1 year ago

Few questions to better understand the situation:

  • Did you upgrade from a previous DT version, or is this an entirely new instance?

This is an entirely new instance that was freshly launched with DT 4.7.0 on Dec 21.

  • Have you mounted the container's /data volume somehow so it is persisted across container restarts?

Yes: the container's /data directory is mounted to a named volume:

# docker-compose config | grep -A38 dtrack-apiserver | head -39 | sed "s/\(ALPINE_.*:\) \(.*\)/\1 here be dragons/"
  dtrack-apiserver:
    deploy:
      resources:
        limits:
          memory: 24576m
        reservations:
          memory: 16384m
      restart_policy:
        condition: on-failure
    environment:
      ALPINE_DATABASE_DRIVER: here be dragons
      ALPINE_DATABASE_MODE: here be dragons
      ALPINE_DATABASE_PASSWORD: here be dragons
      ALPINE_DATABASE_POOL_ENABLED: here be dragons
      ALPINE_DATABASE_POOL_IDLE_TIMEOUT: here be dragons
      ALPINE_DATABASE_POOL_MAX_LIFETIME: here be dragons
      ALPINE_DATABASE_POOL_MAX_SIZE: here be dragons
      ALPINE_DATABASE_POOL_MIN_IDLE: here be dragons
      ALPINE_DATABASE_URL: here be dragons
      ALPINE_DATABASE_USERNAME: here be dragons
      ALPINE_OIDC_CLIENT_ID: here be dragons
      ALPINE_OIDC_ENABLED: here be dragons
      ALPINE_OIDC_ISSUER: here be dragons
      ALPINE_OIDC_TEAMS_CLAIM: here be dragons
      ALPINE_OIDC_TEAM_SYNCHRONIZATION: here be dragons
      ALPINE_OIDC_USERNAME_CLAIM: here be dragons
      ALPINE_OIDC_USER_PROVISIONING: here be dragons
    image: dependencytrack/apiserver:4.7.0
    logging:
      driver: json-file
      options:
        max-file: '20'
        max-size: 100m
    ports:
    - published: 8081
      target: 8080
    restart: unless-stopped
    volumes:
    - dependency-track:/data:rw
  • Has the instance been running the whole time, or was it restarted somewhere in between?

The instance has been up since it was launched on Dec 21.

  • Can you correlate the behavior you're seeing with container restarts, or is this completely unrelated?

I wish I had been able to see such a correlation (since it was also my first suspicion), but from what I can see (using docker ps), it seems as if the container has been running for at least 2 weeks: the CREATED column says "2 weeks ago" and STATUS says "Up 2 weeks". The granularity of docker ps does not, however, show the exact start time of the container process, so it might have been started anywhere between Dec 21 and Dec 27 depending on the semantics of "Up 2 weeks" (i.e. "exactly" | "more than").

carniz commented 1 year ago

For completeness:

# ls -l $(docker volume inspect compose_dependency-track | jq -r '.[].Mountpoint')/.dependency-track/keys
total 12
-rw-r--r-- 1 ubuntu ubuntu 2373 Dec 21 09:27 private.key
-rw-r--r-- 1 ubuntu ubuntu  550 Dec 21 09:27 public.key
-rw-r--r-- 1 ubuntu ubuntu   32 Dec 21 09:27 secret.key

And I figured out a way to display the detailed timestamps:

# docker inspect -f '{{ .Created }}' compose_dtrack-apiserver_1 
2022-12-21T09:26:50.682136863Z

And here's an interesting detail:

# docker inspect -f '{{ .State.StartedAt }}' compose_dtrack-apiserver_1 
2023-01-08T12:47:57.504223704Z

So, the apiserver restarted on Sunday (for unknown reasons ATM), and the OssIndexAnalysisTask seems to be able to use the API token just fine - so far:

# apiserver-logs | grep OssIndexAnalysisTask | tail -10
2023-01-11 08:43:45,544 INFO [OssIndexAnalysisTask] Analyzing 6 component(s)
2023-01-11 08:43:45,544 INFO [OssIndexAnalysisTask] Sonatype OSS Index analysis complete
2023-01-11 08:44:04,664 INFO [OssIndexAnalysisTask] Analyzing 128 component(s)
2023-01-11 08:44:15,076 INFO [OssIndexAnalysisTask] Analyzing 1 component(s)
2023-01-11 08:44:15,076 INFO [OssIndexAnalysisTask] Sonatype OSS Index analysis complete
2023-01-11 08:44:39,850 INFO [OssIndexAnalysisTask] Analyzing 128 component(s)
2023-01-11 08:45:02,901 INFO [OssIndexAnalysisTask] Analyzing 128 component(s)
2023-01-11 08:45:21,489 INFO [OssIndexAnalysisTask] Analyzing 128 component(s)
2023-01-11 08:45:45,410 INFO [OssIndexAnalysisTask] Analyzing 12 component(s)
2023-01-11 08:45:45,410 INFO [OssIndexAnalysisTask] Sonatype OSS Index analysis complete

The token has thus survived a restart, or at least this specific restart, without problems. On the other hand, something happened on Dec 28 that made the apiserver unable to decrypt the value from the DB.

Ripolin commented 1 year ago

Hi,

I have this issue too when running dtrack on Docker Swarm. After a couple of days I have to restart the API service because the API is unreachable, and in the logs I see ERROR [OssIndexAnalysisTask] An error occurred decrypting the OSS Index API Token. Skipping javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.

FYI data is persisted in a dedicated Docker Volume and I use DTrack version 4.7.0.

Any workaround?

yangsec888 commented 1 year ago

We have this issue too - the OSS Index API key works for a while and then stops working (with the same error being thrown). The only way to resolve it is to generate another token from the Sonatype user profile. I think there is a rate-limit threshold on API consumption, and it stops working after the threshold is met.

I did try contacting Sonatype about purchasing a commercial license for this service only, but they would not sell it without purchasing the whole product suite. It's frustrating, but that's the trade-off of a 'free' service.

Legeril commented 11 months ago

This is still happening in 4.8.2 with a mounted /data directory. Please provide a fix.

nscuro commented 3 months ago

To everyone experiencing this issue, please keep an eye on your logs after upgrading to 4.11. We introduced changes in that version to provide more information if this error occurs.

Also note that we will now fall back to unauthenticated usage of OSS Index if decryption of the API key fails, so you won't experience a full-blown service disruption anymore.
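For illustration, a minimal sketch of that fallback pattern (not Dependency-Track's actual implementation): attempt to decrypt the stored token, and on failure log the problem and continue without credentials.

import java.nio.charset.StandardCharsets;
import java.util.Optional;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;

// Illustrative only: return the decrypted token if possible, otherwise an empty
// Optional so the caller can issue unauthenticated OSS Index requests instead.
public class TokenFallbackSketch {
    static Optional<String> decryptTokenOrFallback(byte[] encryptedToken, SecretKey secretKey) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding"); // cipher choice is illustrative
            cipher.init(Cipher.DECRYPT_MODE, secretKey);
            return Optional.of(new String(cipher.doFinal(encryptedToken), StandardCharsets.UTF_8));
        } catch (Exception e) {
            // e.g. BadPaddingException when the key does not match the stored ciphertext
            System.err.println("Could not decrypt OSS Index token, continuing unauthenticated: " + e);
            return Optional.empty();
        }
    }
}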

I tried a lot of things to reproduce this, but was unable to do so. Sharing the extended exception details introduced in 4.11 would help me pinpoint where the issue might be.