drewnoakes / metadata-extractor

Extracts Exif, IPTC, XMP, ICC and other metadata from image, video and audio files
Apache License 2.0

File format could not be determined for encrypted InputStream #536

Closed ghost closed 3 years ago

ghost commented 3 years ago

Cannot retrieve metadata from mp3 files.

This call fails with a "file format could not be determined" exception: ImageMetadataReader.readMetadata(inputStream);

This call does not throw any exception, but it does not read any metadata either: ImageMetadataReader.readMetadata(inputStream, inputStream.available(), FileType.Mp3);

In this case I've used an input stream generated by the Tink library with the StreamingAEAD config. The metadata reader works fine with video and image files.

Bonus consideration: I've inspected the library and it looks like it does not write the bytes to a temp file before reading the metadata. Could you confirm this? Thanks.

drewnoakes commented 3 years ago

Is your stream correctly decrypting the data?

Also inputStream.available() is probably not doing what you think it is.
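To illustrate the point about available(): it returns an estimate of how many bytes can be read without blocking, not the total length of the stream. The sketch below uses only the JDK. ChunkedStream is a hypothetical stand-in for a stream that hands out data in pieces, as a decrypting or network stream may do. It is not Tink's actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class AvailableDemo {

    /** Hypothetical stream that reports at most chunkSize bytes as available at a time. */
    static final class ChunkedStream extends InputStream {
        private final ByteArrayInputStream source;
        private final int chunkSize;

        ChunkedStream(byte[] data, int chunkSize) {
            this.source = new ByteArrayInputStream(data);
            this.chunkSize = chunkSize;
        }

        @Override
        public int read() {
            return source.read();
        }

        @Override
        public int available() {
            // Only an estimate of what is readable right now, capped at one chunk.
            return Math.min(chunkSize, source.available());
        }
    }

    /** Counts the bytes in a stream by reading it to the end. This consumes the stream. */
    static long countBytes(InputStream in) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            // Rethrown unchecked only to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

With 1000 bytes of data and a chunk size of 64, available() reports 64 while countBytes finds all 1000 bytes. If a length is needed, it is safer to take it from a source that actually knows it, such as the file size, than from available().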

I've inspected the library and it looks like it does not write the bytes to a temp file before reading the metadata. Could you confirm this?

The library does not write any files, no.

ghost commented 3 years ago

Hi.

Is your stream correctly decrypting the data?

The same kind of stream with video or image reads the metadata correctly.

Also inputStream.available() is probably not doing what you think it is.

That was a lazy, silly attempt to make it work. I've also tried passing the length of the encrypted file, and the result is the same: no exception and no metadata.

This is how I'm trying to test the result:

for(Directory d: m.getDirectories()){
    for(Tag t : d.getTags()){
        Log.i("metadata", t.getDescription());
    }
}
drewnoakes commented 3 years ago

That code looks fine.

The same kind of stream with video or image reads the metadata correctly.

It would help if you could provide more clarity about what you're trying to do.

If the data is encrypted, you will need to decrypt it before you give it to the library to read. This library expects unencrypted data as input.

ghost commented 3 years ago

I'm trying to read potentially large encrypted media files without writing them to disk. The library I'm using for encryption/decryption is Tink for Android with the StreamingAEAD primitive, which gives the option of opening a readable InputStream or a SeekableByteChannel.

I think the input stream does not support the reset method. Because of this, I've been opening separate streams: one for reading the content and another for reading the metadata.

Another option would be to decrypt and wrap everything inside a ByteArrayInputStream, but for large files that would lead to an out-of-memory error.

I initially tried to use the ExifInterface library and MediaMetadataRetriever from Google, but they don't work for encrypted images or videos, while your library actually does.

drewnoakes commented 3 years ago

If you've used the stream to decode the image, you will have to seek back to the beginning before using the same stream with this library. If the stream doesn't support seek, you'll have to re-create it.
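A minimal JDK-only sketch of the re-create approach described above. The opener is a hypothetical stand-in for whatever code builds the decrypting stream. The example shows that a stream which has already been read to the end yields nothing on a second pass, while a freshly opened one starts from the beginning again.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

final class StreamReuse {

    /** Reads the stream to the end and returns how many bytes were seen. */
    static long drain(InputStream in) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            // Rethrown unchecked only to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Opens a fresh stream from the (hypothetical) opener, drains it, and closes it. */
    static long drainFresh(Supplier<InputStream> opener) {
        try (InputStream in = opener.get()) {
            return drain(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling drain twice on the same stream returns the full length the first time and 0 the second time. Calling drainFresh twice returns the full length both times, because each call starts from a new stream.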

Nadahar commented 3 years ago

This might be a red herring, but have you tried to call Mp3MetadataReader directly instead of using ImageMetadataReader?

ghost commented 3 years ago

If you've used the stream to decode the image, you will have to seek back to the beginning before using the same stream with this library. If the stream doesn't support seek, you'll have to re-create it.

Yes, I used two different streams, and I even tried loading only the metadata.

This might be a red herring, but have you tried to call Mp3MetadataReader directly instead of using ImageMetadataReader?

Same result. I just tried decrypting the file to the hard drive and reading the metadata from a File object, and it can only print the file name, size and type.

Nadahar commented 3 years ago

Same result. I just tried decrypting the file to the hard drive and reading the metadata from a File object, and it can only print the file name, size and type.

That begs the question: Is the content a valid MP3? It might be that it's without ID3(v2) tags, and that it "confuses" the parser. Is it possible to share a file with the (unencrypted) content of one such stream?

kwhopper commented 3 years ago

Sharing an example file would make this easier to debug, although it's understood if they're copyrighted and/or private.

ghost commented 3 years ago

Here it is: https://mega.nz/file/70g0jaSK#rfRDMpLq1UashB5-_M96oFOd_mFfLyUFrO7EnNw96d8

I got it from a copyright-free website.

drewnoakes commented 3 years ago

The file you've linked is an MP3 that starts with an ID3 tag. This library does not currently support ID3, and that is being tracked by #328. The concept of encryption here was misleading. The other issue is clearer, so I'll close this as a duplicate for now.
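For anyone who wants to check whether their own file falls into this case, here is a rough JDK-only sketch that tests whether the first bytes of a file look like an ID3v2 header. It is not part of this library. The class and method names are made up for illustration. It relies on the ID3v2 header layout: the three bytes "ID3", two version bytes, one flags byte, then a four-byte "synchsafe" size with seven useful bits per byte.

```java
final class Id3Probe {

    private static final int HEADER_LENGTH = 10;

    /**
     * Returns the length in bytes of a leading ID3v2 tag (header plus body),
     * or -1 if the given bytes do not start with a well-formed ID3v2 header.
     * The optional 10-byte footer of ID3v2.4 is not accounted for.
     */
    static int id3v2TagLength(byte[] head) {
        if (head.length < HEADER_LENGTH) {
            return -1;
        }
        if (head[0] != 'I' || head[1] != 'D' || head[2] != '3') {
            return -1;
        }
        // Bytes 6..9 hold the body size as a synchsafe integer:
        // the high bit of each byte is always zero.
        int size = 0;
        for (int i = 6; i < HEADER_LENGTH; i++) {
            int b = head[i] & 0xFF;
            if (b >= 0x80) {
                return -1; // not a valid synchsafe byte
            }
            size = (size << 7) | b;
        }
        return HEADER_LENGTH + size;
    }
}
```

Passing the first ten bytes of the decrypted data is enough. A non-negative result means the file begins with an ID3v2 tag, and the audio data starts at that offset, or ten bytes later if an ID3v2.4 footer is present.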