veryfi / veryfi-java

Java module for communicating with the Veryfi OCR API.
MIT License

JSONObject parameter in processDocument #31

Closed DraGengX closed 1 year ago

DraGengX commented 1 year ago

I found a potentially misleading example in your README for the Java API. I have already emailed support to report this. Your example allows the JSONObject to be null. However, when I tried this I got the following response:

{"status": "fail", "error": "Malformed parameters", "details": [{"loc": ["__root__"], "msg": "One of ['file_data', 'file_url', 'file_urls', 'dictation', 'package_path'] must be supplied", "type": "value_error"}]}

I have confirmed that it is caused by the JSONObject parameter: when I add one of the parameters listed in that message, I get a different error, which suggests the file is now being picked up:

{"status": "fail", "error": "Could not decode base64 encoded file, Incorrect padding"}

So, in my opinion, the fourth parameter (the JSONObject) cannot be null.

DraGengX commented 1 year ago

image

Kaevan89 commented 1 year ago

Hi @DraGengX, in both cases you are doing something wrong. In the first case, you appear to be sending a wrong path. I may be mistaken, but are you sure the correct location is D:\TryVeryfi\steak.JPEG, and do you have permission to access it? In the picture, I saw steak.JPEG in a different location.

In the second case, you are hardcoding the file_data parameter. You can do that, but it is not the right way: you would also need to send a file_name, and sending "D:\TryVeryfi\steak.JPEG" as the data will not give you good results, because file_data must be the image encoded in base64.

And to answer your question, if you send an empty JSONObject or a null variable, you'll get the same results, so it's okay to allow sending that value as null.
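To illustrate the base64 point above, here is a minimal Java sketch using only the standard library. The class and method names (FileDataCheck, encode, isValidBase64) are hypothetical and not part of the Veryfi SDK. It shows that a Windows path string is not decodable base64 text, while the encoded bytes of a file are.

```java
import java.util.Base64;

// Hypothetical helper for illustration only; not part of the Veryfi SDK.
final class FileDataCheck {

    // Encodes raw file bytes as Base64 text, the form file_data is described as holding.
    static String encode(byte[] content) {
        return Base64.getEncoder().encodeToString(content);
    }

    // Returns true if the text can be decoded by the standard Base64 decoder.
    static boolean isValidBase64(String text) {
        try {
            Base64.getDecoder().decode(text);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A file path is plain text, not encoded file content.
        String path = "D:\\TryVeryfi\\steak.JPEG";
        System.out.println("path decodes as Base64: " + isValidBase64(path));

        // Stand-in bytes; in practice these would be the bytes read from the image file.
        byte[] imageBytes = {1, 2, 3};
        System.out.println("encoded bytes decode as Base64: " + isValidBase64(encode(imageBytes)));
    }
}
```

In a real request the encoded string would be sent as file_data together with a file_name, as described above.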

DraGengX commented 1 year ago

Let me try another picture (maybe not JPEG). I guess it is encoding.

DraGengX commented 1 year ago

It seems that ClassLoader.getSystemResourceAsStream can't read the existing file.

DraGengX commented 1 year ago

I found out what happened. ClassLoader.getSystemResourceAsStream, which processDocument in the Java SDK uses, only reads files from the resources folder. It does not work with an absolute path, whereas process_document in the Python SDK does.

image
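The difference described above can be checked with a small standalone Java sketch. It is independent of the Veryfi SDK, and the class and method names are hypothetical. It compares a classpath lookup with a direct filesystem read for the same temporary file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of the Veryfi SDK.
final class ResourceLookupDemo {

    // True if the name resolves as a resource on the system classpath.
    static boolean foundOnClasspath(String name) {
        // getSystemResourceAsStream returns null when no such resource exists.
        try (InputStream in = ClassLoader.getSystemResourceAsStream(name)) {
            return in != null;
        } catch (IOException e) {
            return false;
        }
    }

    // True if the path can be opened directly from the filesystem.
    static boolean foundOnFilesystem(Path path) {
        try (InputStream in = Files.newInputStream(path)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a file outside the classpath, then look it up both ways.
        Path temp = Files.createTempFile("receipt", ".jpg");
        try {
            String absolute = temp.toAbsolutePath().toString();
            System.out.println("classpath lookup:  " + foundOnClasspath(absolute));
            System.out.println("filesystem lookup: " + foundOnFilesystem(temp));
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

If the classpath lookup fails while the filesystem lookup succeeds, the file exists but is simply not visible as a classpath resource, which matches the behavior reported in this thread.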

kevkon3 commented 1 year ago

@Kaevan89 I am having the same issue. I have also determined the issue to be caused by how the Java and Kotlin libraries are trying to read the file (I found the issue in Kotlin and came to see if the same issue would be seen in Java).

Since ClassLoader.getSystemResourceAsStream is being used, the file MUST be in the resources directory for it to work properly. This would be fine for test cases that already have the files in the correct location, but is not adequate when files exist elsewhere on the filesystem.

Compare this to the C# library which provides methods for processing the file as a stream, a byte array, or as a file path (not resource): https://github.com/veryfi/veryfi-csharp/blob/master/src/libs/Veryfi/Veryfi.ProcessDocumentFile.cs

I was just able to confirm this locally by changing how the file is read. My fix is in Kotlin, but it should be easy enough to port to Java. I'm going to open a Kotlin issue and PR for this in that project.
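As a rough illustration of the three input styles mentioned for the C# library (file path, stream, byte array), a Java version might look like the sketch below. This is not the actual fix from the Kotlin PR and not the SDK's API. The class DocumentInputs and its methods are hypothetical, and each one only produces the base64 text that would be sent as file_data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical sketch; names and shape are assumptions, not the Veryfi SDK.
final class DocumentInputs {

    // Byte array input: encode the content directly.
    static String fromBytes(byte[] content) {
        return Base64.getEncoder().encodeToString(content);
    }

    // Stream input: read everything into memory, then encode.
    static String fromStream(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return fromBytes(buffer.toByteArray());
    }

    // File path input: works for any readable filesystem location,
    // not only files on the classpath.
    static String fromPath(Path file) throws IOException {
        return fromBytes(Files.readAllBytes(file));
    }
}
```

Routing the stream and path variants through fromBytes keeps the encoding in one place, so a caller can start from whichever form of the file it already has.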