apache / jmeter

Apache JMeter: an open-source load testing tool for analyzing and measuring the performance of a variety of services
https://jmeter.apache.org/
Apache License 2.0

Use Google Magika for file type detection instead of Apache Tika #6239

Open: vlsi opened 2 months ago

vlsi commented 2 months ago

Use case

Currently, JMeter uses Apache Tika to detect file types.
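For context, content-based file type detection generally means inspecting a file's leading bytes rather than trusting its extension. The following is only a minimal sketch of that idea in plain Java. It is not how Tika, Magika, or JMeter implement detection; the class name `MagicSniffer`, the method `detect`, and the small signature table are hypothetical and chosen purely for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: maps a few well-known magic numbers to MIME types.
public class MagicSniffer {
    // Signature table (assumption: only four formats, checked in insertion order).
    private static final Map<String, byte[]> SIGNATURES = new LinkedHashMap<>();

    static {
        SIGNATURES.put("image/png",
                new byte[] {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A});
        SIGNATURES.put("image/gif", "GIF8".getBytes(StandardCharsets.US_ASCII));
        SIGNATURES.put("application/pdf", "%PDF-".getBytes(StandardCharsets.US_ASCII));
        SIGNATURES.put("application/zip", new byte[] {'P', 'K', 0x03, 0x04});
    }

    // Returns the MIME type whose signature prefixes the given bytes,
    // or a generic binary type when nothing matches.
    public static String detect(byte[] head) {
        for (Map.Entry<String, byte[]> entry : SIGNATURES.entrySet()) {
            if (startsWith(head, entry.getValue())) {
                return entry.getKey();
            }
        }
        return "application/octet-stream";
    }

    // True when data begins with every byte of prefix.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] pdfHead = "%PDF-1.7\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(pdfHead));
    }
}
```

A real detector covers far more formats and ambiguous cases than a fixed prefix table can, which is why the discussion is about a dedicated library.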

Possible solution

We could replace Tika with Magika: https://google.github.io/magika/

On the other hand, Magika depends on TensorFlow, which might be a non-trivial dependency.

Possible workarounds

No response

JMeter Version

5.6.3

Java Version

No response

OS Version

No response

FSchumacher commented 2 months ago

Do we have a real problem with using Tika? From reading the Magika repo, it looks like a Python (and JavaScript?) solution. Would it be easy to add to our dependencies? Does it work locally (without internet access)?

vlsi commented 2 months ago

I thought Tika took up significant space dependency-wise, whereas the Magika model is only ~1 MiB. However, Magika requires TensorFlow, so it would probably pull in a lot of dependencies :( And yes, it works without internet access.