yannick-cw / notion-ocr

Adding OCR support to Notion

"parsing Text failed, expected String, but encountered Array" #6

Closed NearlCrews closed 4 years ago

NearlCrews commented 4 years ago

Hello,

I was testing this out but getting the following error:

~/notion-ocr-0.1$ ./notion-ocr -t "API"
notion-ocr: JSONError "Error in $.results[0].value.properties.source[0][1]: parsing Text failed, expected String, but encountered Array"

Any ideas what I could do to provide debugging information? I'd love to try this, but it's choking whenever I attempt to run it.

yannick-cw commented 4 years ago

Hey, can you try with the 0.1.1 version? I added a -v flag for verbose logging. Please post the response JSON for the getRecordValues request. There seems to be a different format for the image source :)

NearlCrews commented 4 years ago

I just tried with 0.1.2 and had the same problem. Is there a preferred way I could send this outside of GitHub? The output obviously has identifiable information in there and I'd rather not post it. Thanks in advance!

yannick-cw commented 4 years ago

Yes, there is no fix in there yet :) You can send it to the email on my profile, https://github.com/yannick-cw, or tweet me at https://twitter.com/y_gldw

Replacing the user ID fields in the response would also work.

yannick-cw commented 4 years ago

For the second problem with

<stdout>: commitAndReleaseBuffer: invalid argument (invalid character)

Can you try setting

export LC_ALL=en_US.UTF-8

Before running it?
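
The locale matters because text containing non-ASCII characters can only be written to an output stream whose encoding can represent them, and an unset locale commonly falls back to an ASCII-only default. A minimal Python sketch of that idea, not of notion-ocr's own code (the helper name `can_encode` and the sample string are made up for illustration):

```python
def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Hypothetical OCR output containing a non-ASCII character.
sample = "Müller GmbH"

print(can_encode(sample, "ascii"))  # the "ü" has no ASCII representation
print(can_encode(sample, "utf-8"))  # UTF-8 can represent any Unicode text
```

Setting LC_ALL to a UTF-8 locale, as suggested above, puts the process in the second situation.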

NearlCrews commented 4 years ago

Ha, well, that was an interesting one. I was running this in an LXC and realized there was no LC_ALL set. I did the following as root:

echo "LC_ALL=en_US.UTF-8" >> /etc/environment
echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen
echo "LANG=en_US.UTF-8" > /etc/locale.conf
locale-gen en_US.UTF-8

Logged out and logged back in as a normal user. The binary ran for a while but choked on a particular file with this error:

notion-ocr: JSONError "Error in $.results[0].value.properties.source[0][1]: parsing Text failed, expected String, but encountered Array"

I checked that file and realized I'd added "add_ocr" to a PDF. I removed the tag, reran the script, and it processed the rest just fine. All good!

Thanks, Nate

yannick-cw commented 4 years ago

Great! Could you still send me the JSON response? I’d prefer to handle the PDF problem gracefully. Probably the one for the getRecordValues request. Thanks

NearlCrews commented 4 years ago

Yep, attached. I changed the URL, but in the end it's just a bunch of business cards if you need it. Thanks again!

Response Body: {"results":[{"role":"editor","value":{"id":"2c87c3cd-7117-401c-bf9a-acc8d59dd9bd","version":1,"type":"pdf","properties":{"title":[["Scannable_Document_on_Apr_19_2018_at_4_40_22_AM.pdf"]],"source":[["https://s3-us-west-2.amazonaws.com/secure.notion-static.com/XXXXXXX-4eac-4dcf-af07-8bebce78d34d/Scannable_Document_on_Apr_19_2018_at_4_40_22_AM.pdf",[["a","https://s3-us-west-2.amazonaws.com/secure.notion-static.com/XXXXXXX-4eac-4dcf-af07-8bebce78d34d/Scannable_Document_on_Apr_19_2018_at_4_40_22_AM.pdf"]]]]},"created_by":"eb46f37a-5714-4096-92ce-6570c0a5f404","created_time":1573882923274,"last_edited_by":"eb46f37a-5714-4096-92ce-6570c0a5f404","last_edited_time":1573882923274,"parent_id":"a230223a-34c9-48e0-ab84-91c1e6108fe6","parent_table":"block","alive":true,"file_ids":["06b4d828-4eac-4dcf-af07-8bebce78d34d"],"ignore_block_count":true,"created_by_table":"notion_user","created_by_id":"eb46f37a-5714-4096-92ce-6570c0a5f404","last_edited_by_table":"notion_user","last_edited_by_id":"eb46f37a-5714-4096-92ce-6570c0a5f404"}}]}

notion-ocr: JSONError "Error in $.results[0].value.properties.source[0][1]: parsing Text failed, expected String, but encountered Array"

yannick-cw commented 4 years ago

Thanks! Wow, their JSON structure is crazy at times:

"source": [
  [
    "https://s3-us-west-2.amazonaws.com/secure.notion-static.com/XXXXXXX-4eac-4dcf-af07-8bebce78d34d/Scannable_Document_on_Apr_19_2018_at_4_40_22_AM.pdf",
    [
      [
        "a",
        "https://s3-us-west-2.amazonaws.com/secure.notion-static.com/XXXXXXX-4eac-4dcf-af07-8bebce78d34d/Scannable_Document_on_Apr_19_2018_at_4_40_22_AM.pdf"
      ]
    ]
  ]
]
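
To make the shape concrete: each entry of "source" is a list whose first element is the text and whose optional second element is a nested array of annotations, which is why a parser expecting a string at source[0][1] fails. The following is a rough Python sketch of one tolerant way to read it, not how notion-ocr itself parses the response. The names `source_url` and `is_ocr_candidate` are hypothetical, the URL is a placeholder, and the rule that only "image" blocks are handed to OCR is an assumption made for illustration.

```python
import json

# Trimmed-down stand-in for the response above; the URL is a placeholder.
RESPONSE = json.loads("""
{"results": [{"role": "editor", "value": {
  "type": "pdf",
  "properties": {
    "title": [["scan.pdf"]],
    "source": [["https://example.invalid/scan.pdf",
                [["a", "https://example.invalid/scan.pdf"]]]]
  }}}]}
""")

def source_url(block):
    """Return the text of the first source segment, or None.

    Each segment is [text] or [text, annotations]; only index 0 is read,
    so the optional annotations array is never treated as a string.
    """
    segments = block.get("properties", {}).get("source") or []
    if segments and segments[0] and isinstance(segments[0][0], str):
        return segments[0][0]
    return None

def is_ocr_candidate(block):
    """Assumed filter: skip anything that is not an image block."""
    return block.get("type") == "image" and source_url(block) is not None

block = RESPONSE["results"][0]["value"]
print(source_url(block))        # the placeholder URL
print(is_ocr_candidate(block))  # the block type is "pdf", so it is skipped
```

With a check like this, a PDF tagged by mistake would be skipped instead of aborting the whole run.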