Closed xavbart closed 4 years ago
Okay, this is going to be a bit more complicated. In my setup I can take the space id to search and it finds all my content. It seems you have a different setup, which could be why it does not find anything. To find out what's going on, we'd need to check:
developer tools -> network
the loadUserContent request from this verbose output, and in which field this id is written
I guess your search in the browser does not use the id from the request body you posted above, or maybe the "table": "space" is different...
Indeed it seems to call another ID than the one used by the notion-ocr exec
{query: "add_ocr", table: "space", id: "5eef59c4-4417-4cae-a1a4-2be5cafd8a24", limit: 20}
Is any part of the response to https://www.notion.so/api/v3/loadUserContent relevant? I actually have 4 different Notion spaces altogether, so could this be the issue? (It searches inside only one, and the wrong one?)
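To illustrate the multi-space suspicion, here is a minimal sketch that builds one search request body per space instead of only one. The body fields (query, table, id, limit) are copied from the request shown above. The assumption that the loadUserContent response lists spaces under recordMap.space keyed by space id is mine, and the helper names are hypothetical; this is not how notion-ocr itself is implemented.

```python
from typing import Any, Dict, List


def space_ids(load_user_content: Dict[str, Any]) -> List[str]:
    """Return every space id found under recordMap.space.

    ASSUMPTION: the response keeps spaces in a dict keyed by space id.
    """
    spaces = load_user_content.get("recordMap", {}).get("space", {})
    return list(spaces.keys())


def search_payloads(
    load_user_content: Dict[str, Any], query: str = "add_ocr", limit: int = 20
) -> List[Dict[str, Any]]:
    """Build one search body per space, mirroring the body seen in the browser."""
    return [
        {"query": query, "table": "space", "id": sid, "limit": limit}
        for sid in space_ids(load_user_content)
    ]


if __name__ == "__main__":
    # Made-up response with two spaces, for illustration only.
    sample = {"recordMap": {"space": {"space-a": {}, "space-b": {}}}}
    for payload in search_payloads(sample):
        print(payload)
```

Searching every space avoids silently picking the wrong one when an account has several.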
Perfect, it is the spaces. I am going to push a fix for that and then we can try.
Would you want a sample of the JSON describing my multi-space situation? (although I am not sure I can make sense of the structure as is)
No, I think I found the problem: I created a multi-space setup locally and tested it. Maybe later, if my solution does not work :)
@xavbart can you try with 0.1.4?
Just did (saw your update). It does find things, but failed on retrieving a page that seems to annoy it.
notion-ocr: JSONError "Error in $.recordMap.block['53871674-0724-461f-990d-74e555371f32'].value.properties.title[1][1]: parsing Text failed, expected String, but encountered Array
For info, it seems to point to that page https://www.notion.so/We-the-Doers-Fiverr-s-Entrepreneurial-Populism-and-a-3-Days-Workweek-THE-ENTREPRECARIAT-734d707063d04189a58c6a673cb3670d
(whose title has maybe an issue in parsing because of the pipe | ?)
So that stops your execution. I may change it to allow further execution unless you want me to see it as a test for 0.1.5.
UPDATE:
ignore above. It seems it fails on YOUR page as I saved it in Notion (duh) and it has some issues. I'll try and put it in the trash to see if it does skip it.
UPDATE 2:
Ok that was it. I had saved your description page in Notion as you can see here
https://www.notion.so/Search-In-Your-Notion-Images-to-for-Notion-53c74ca643a344969517624fa56eb244
and it would not parse it properly.
Now it does parse the images of all pages if I put above page in the bin. But it doesn't if this page is visible. I'll leave it available for you to copy it across if you need to test?
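The "allow further execution" idea mentioned above could look roughly like the sketch below: parse each block on its own, record the ones that fail, and carry on with the rest. Both function names are hypothetical, and the strict parser only imitates the kind of failure reported in the error message; it is not the tool's real parser.

```python
from typing import Any, Callable, Dict, Tuple


def strict_title(record: Dict[str, Any]) -> str:
    """Hypothetical strict parser: every element of a title segment must be a string."""
    title = record["value"]["properties"]["title"]
    for segment in title:
        for item in segment:
            if not isinstance(item, str):
                raise ValueError(f"expected str, got {type(item).__name__}")
    return "".join(segment[0] for segment in title)


def parse_all(
    blocks: Dict[str, Any], parse: Callable[[Dict[str, Any]], str] = strict_title
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Parse every block; collect failures instead of aborting on the first one."""
    parsed: Dict[str, str] = {}
    skipped: Dict[str, str] = {}
    for block_id, record in blocks.items():
        try:
            parsed[block_id] = parse(record)
        except (KeyError, IndexError, TypeError, ValueError) as err:
            skipped[block_id] = str(err)  # keep the reason for later debugging
    return parsed, skipped


if __name__ == "__main__":
    blocks = {
        "ok": {"value": {"properties": {"title": [["plain text"]]}}},
        "bad": {"value": {"properties": {"title": [["right", [["b"]]]]}}},
    }
    parsed, skipped = parse_all(blocks)
    print(parsed)
    print(skipped)
```

With this shape, one unparseable page is reported but no longer blocks the OCR run for all other pages.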
I’d like to debug the problem, can you send me the response when you run it verbose? Especially the value.properties.title part would be interesting ;)
This is the one I quoted at the beginning of my feedback:
notion-ocr: JSONError "Error in $.recordMap.block['53871674-0724-461f-990d-74e555371f32'].value.properties.title[1][1]: parsing Text failed, expected String, but encountered Array
Oh sorry you meant the JSON blob. Sure.
I isolated the part of the response for that page which has an "array in title":
{
  "role": "editor",
  "value": {
    "id": "53871674-0724-461f-990d-74e555371f32",
    "version": 1,
    "type": "text",
    "properties": {
      "title": [
        ["In the line "],
        ["right", [["b"]]],
        [" below any image in notion write "],
        ["add_ocr", [["c"]]],
        [", the next time the tool runs, it replaces that with the text scanned from the image."]
      ]
    },
    "created_by": "4ca61bfb-f1b8-409e-ba27-0fedb84839d6",
    "created_time": 1574508285559,
    "last_edited_by": "4ca61bfb-f1b8-409e-ba27-0fedb84839d6",
    "last_edited_time": 1574508285559,
    "parent_id": "38ad2b8f-ac27-46be-9e45-b139163da860",
    "parent_table": "block",
    "alive": true,
    "ignore_block_count": true,
    "created_by_table": "notion_user",
    "created_by_id": "4ca61bfb-f1b8-409e-ba27-0fedb84839d6",
    "last_edited_by_table": "notion_user",
    "last_edited_by_id": "4ca61bfb-f1b8-409e-ba27-0fedb84839d6"
  }
}
Tell me if you need more, but as you have the page itself, you might find out why it was treating the block title as an array. (Hint: it seems to be because of formatting. Notion apparently splits the title into array elements, one per differently formatted text part, and attaches the formatting inside each element.)
(So yes, this sort of issue might come up often with any user content.)
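As a sketch of what tolerating that structure could look like: treat the title as a list of segments, where each segment is either [text] or [text, formatting], and join only the text parts. The segment layout is inferred from the JSON blob in this thread, not from any Notion documentation, and title_to_text is a hypothetical helper name.

```python
from typing import Any, List


def title_to_text(title: List[List[Any]]) -> str:
    """Join the text of every segment, ignoring the optional formatting element."""
    parts = []
    for segment in title:
        # segment[0] is the text; segment[1], if present, carries formatting.
        if segment and isinstance(segment[0], str):
            parts.append(segment[0])
    return "".join(parts)


if __name__ == "__main__":
    # Title copied from the block quoted in this thread.
    title = [
        ["In the line "],
        ["right", [["b"]]],
        [" below any image in notion write "],
        ["add_ocr", [["c"]]],
        [", the next time the tool runs, it replaces that with the text scanned from the image."],
    ]
    print(title_to_text(title))
```

Reading only the first element of each segment keeps the search for add_ocr working regardless of whether the text is bold, code, or unformatted.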
I think with 0.1.5 this is fixed now.
brew upgrade notion-ocr is not seeing the latest 0.1.5. Am I doing something wrong?
Warning: yannick-cw/tap/notion-ocr 0.1.4 already installed
Likewise for a re-install:
Warning: yannick-cw/tap/notion-ocr 0.1.4 is already installed and up-to-date
To reinstall 0.1.4, run 'brew reinstall notion-ocr'
Ah sorry, I did not update the brew package yet. My Mac is at work, so I will do it on Monday.
No prob, I'll use the other install methods.
@xavbart I also released brew again!
Yep, now this seems fixed. Guess the whole issue can be closed. Thanks.
As requested. I have images marked accordingly: see an example page here https://www.notion.so/xavbart/Image-recog-test-32473ad27380455582c08b62f7009e81. Running verbose, I can see that the Notion API does find my Notion account and returns the large JSON covering my account details, but the query for the add_ocr tag returns nothing.