gwu-libraries / TweetSets

Service for creating Twitter datasets for research and archiving.
MIT License

Spark & log4j security #163

Open dolsysmith opened 2 years ago

dolsysmith commented 2 years ago

Spark uses log4j, but I don't think the CVE-2021-44228 vulnerability exposes our application: we don't expose the Spark UI, and the data pipeline flows only one way (from Spark into the rest of the application), so the only way to interact with Spark is from the command line. Still, best practice would be either to apply the log4j patch or to add to the Dockerfile the command-line parameter that disables the problematic log4j property at startup.
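For reference, here is a rough sketch of the second option, assuming the parameter in question is the Log4j 2 system property `log4j2.formatMsgNoLookups` (only honored by Log4j 2.10 and later) and that jobs are launched with `spark-submit`. The helper name `spark_submit_args` and the script name `load_tweets.py` are placeholders, not anything in this repo. The same arguments could go into the Dockerfile's `CMD` or entrypoint script.

```python
import shlex

# Assumption: this is the Log4j 2 system property that disables message
# lookups (honored by Log4j 2.10+). It is a stopgap, not a replacement
# for upgrading the log4j jars.
LOG4J_FLAG = "-Dlog4j2.formatMsgNoLookups=true"


def spark_submit_args(app_path, existing_executor_opts=""):
    """Build a spark-submit argument list that passes LOG4J_FLAG to both
    the driver JVM and the executor JVMs.

    existing_executor_opts: any JVM options already set for executors,
    so that the flag is appended rather than overwriting them.
    """
    executor_opts = f"{existing_executor_opts} {LOG4J_FLAG}".strip()
    return [
        "spark-submit",
        # Driver options have to be given on the command line (or in a
        # properties file) because the driver JVM is already running by
        # the time application code could set them.
        "--driver-java-options", LOG4J_FLAG,
        "--conf", f"spark.executor.extraJavaOptions={executor_opts}",
        app_path,
    ]


if __name__ == "__main__":
    # "load_tweets.py" is a hypothetical script name used for illustration.
    print(shlex.join(spark_submit_args("load_tweets.py")))
```

Whether this is needed at all depends on which log4j version the Spark image actually bundles, so it is worth checking the jars in the image before relying on it.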