google / magika

Detect file content types with deep learning
https://google.github.io/magika/
Apache License 2.0

Detection of error situations #64

Open JianYuDeng opened 6 months ago

JianYuDeng commented 6 months ago

If I use the cmd command "copy/b ..." to disguise a file, I can fool the content-type detection: the reported type is incorrect. (screenshot attached)

reyammer commented 6 months ago

I'm not sure I understand. Is that hhh.zip the same file as flower.jpg?

JianYuDeng commented 6 months ago

> I'm not sure I understand. Is that hhh.zip the same file as flower.jpg?

No. First, I zipped an MP4 file into hhh.zip. Then I disguised hhh.zip as an image by appending it to flower.jpg with the cmd command: copy/b flower.jpg + hhh.zip hhh.jpg. Magika recognizes the resulting hhh.jpg as an image. I then changed the extension of hhh.jpg to .zip, and Magika still recognized hhh.zip as an image. I think this is incorrect, because malicious files could be smuggled through this kind of disguise.
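
For anyone who wants to reproduce this without the original files, here is a minimal sketch in Python using only the standard library. It does not call Magika. It only builds the same kind of concatenated file and shows why it is ambiguous: the bytes start with a JPEG signature, while the ZIP end-of-central-directory record sits at the end, so a ZIP reader still opens it. The names FAKE_JPEG, make_zip, looks_like_jpeg and has_appended_zip are hypothetical helpers made up for this illustration, and the JPEG and MP4 payloads are placeholders rather than real media.

```python
import io
import zipfile

# Placeholder for flower.jpg: JPEG start marker, filler bytes, JPEG end marker.
# This is NOT a decodable image; it only carries the JPEG signature.
FAKE_JPEG = b"\xff\xd8\xff\xe0" + b"\x00" * 16 + b"\xff\xd9"


def make_zip(name: str, payload: bytes) -> bytes:
    """Build an in-memory ZIP archive holding a single member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


def looks_like_jpeg(data: bytes) -> bool:
    """Check only the leading JPEG signature bytes."""
    return data.startswith(b"\xff\xd8\xff")


def has_appended_zip(data: bytes) -> bool:
    """Check whether a ZIP end-of-central-directory record can be found."""
    return zipfile.is_zipfile(io.BytesIO(data))


# Equivalent of: copy/b flower.jpg + hhh.zip hhh.jpg
polyglot = FAKE_JPEG + make_zip("video.mp4", b"placeholder, not a real MP4")

print("JPEG signature at start:", looks_like_jpeg(polyglot))
print("ZIP archive found:", has_appended_zip(polyglot))
print("ZIP members:", zipfile.ZipFile(io.BytesIO(polyglot)).namelist())
```

Both checks succeed on the same bytes, so the file is a valid-looking JPEG with a ZIP archive appended, and renaming it does not change its content. A check like has_appended_zip could be run alongside a content-type detector to flag such files, although that is only one possible mitigation and it covers ZIP payloads only.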