chrismattmann / tika-python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Apache License 2.0
1.51k stars, 234 forks

Using `InMemoryUploadedFile` with tika #370

Closed hamodey closed 1 year ago

hamodey commented 2 years ago

Hi,

I am uploading a file to my database through a Django backend, and I would like to read the file's contents with tika while it is being processed. However, I am getting the error:

`AttributeError: 'InMemoryUploadedFile' object has no attribute 'decode'`

I assume the issue is that the temporary file object Django hands over differs from what tika is expecting. Are there any suggestions or documentation I could have a look at?

Thanks

chrismattmann commented 1 year ago

tika-python doesn't directly handle object streams in `parser.from_file` (you have to use `.from_buffer`). But the latest PR I just committed in this release should address that. Thanks @hamodey
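For anyone landing here with the same error, below is a minimal sketch of the `.from_buffer` route mentioned above. It assumes the upload object exposes `read()` and `seek()` the way Django's `InMemoryUploadedFile` does. The helper names `read_upload_bytes` and `parse_upload` and the `parse_buffer` parameter are made up for illustration and are not part of tika-python or Django.

```python
import io


def read_upload_bytes(uploaded_file):
    """Return the full contents of a file-like upload as bytes."""
    uploaded_file.seek(0)   # rewind in case something already read from it
    data = uploaded_file.read()
    uploaded_file.seek(0)   # leave it rewound for later consumers (e.g. saving to the db)
    return data


def parse_upload(uploaded_file, parse_buffer=None):
    """Parse an upload by passing its raw bytes to a buffer-based parser.

    parse_buffer is a hypothetical hook so the parser can be swapped out;
    by default it falls back to tika's parser.from_buffer.
    """
    if parse_buffer is None:
        # Imported lazily so this module loads even where tika is not installed.
        from tika import parser
        parse_buffer = parser.from_buffer
    return parse_buffer(read_upload_bytes(uploaded_file))


if __name__ == "__main__":
    # io.BytesIO stands in for the Django upload object here.
    fake_upload = io.BytesIO(b"example bytes")
    print(len(read_upload_bytes(fake_upload)))
```

In a Django view this might be called as `parse_upload(request.FILES["document"])`, where `"document"` is an assumed form field name. Reading the whole upload into memory is reasonable for in-memory uploads, but may not suit very large files.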