AminSallah / Flow.Launcher.Plugin.Notion

Quick capture plugin for search, create, edit, and delete Notion pages.
MIT License

Pages with only a mention in their title get ignored by the data parser #5

Closed tim-wag closed 5 months ago

tim-wag commented 5 months ago

https://github.com/AminSallah/Flow.Launcher.Plugin.Notion/blob/de285c268e09c61c6e1e6615113b0162bf79a987/src/NotionDataParser.cs#L162

In the line linked above, the title is extracted using the "text" property, which excludes "mention" elements. As a result, my pages titled with only a mention are treated as having an empty title. I'm opening an issue because I'm not sure how to fix this. I tried using result["properties"][title]["title"][0]["plain_text"], which also ignores mentions unless the title has no text and only mentions (this is a quirk of the Notion API). I don't see a good solution other than walking through every element of the title property and building the title manually, but that isn't very efficient.

At minimum, I would like the title to be parsed using result["properties"][title]["title"][0]["plain_text"], which would at least make my pages accessible.

AminSallah commented 5 months ago

If using mentions is crucial for your page titles, implementing this method should fix it. However, one consideration to note is that the mentioned date may not be in a humanized format.

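// Concatenate the plain_text of every rich-text element in the title (text, mentions, ...).
string GetFullTitle(JToken titleList)
{
    string extractedTitle = string.Empty;
    foreach (var titleType in titleList)
    {
        // ?. avoids a NullReferenceException if an element has no plain_text.
        extractedTitle += titleType["plain_text"]?.ToString();
    }
    return extractedTitle;
}

// Call site in the parser:
extractedTitle = GetFullTitle(result["properties"][title]["title"]);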

This change will be added in v3, but it will ignore date mention types. Let me know if this fixes your issue.
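For anyone who wants to try the idea outside the plugin, here is a minimal standalone sketch. It uses System.Text.Json instead of the Newtonsoft JToken shown above so that it runs without extra packages, and the sample JSON is made up for illustration (a page mention followed by plain text). It is not the plugin's actual parser.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Join the plain_text of every rich-text element in a Notion title array.
string GetFullTitle(JsonElement titleList)
{
    var sb = new StringBuilder();
    foreach (JsonElement part in titleList.EnumerateArray())
    {
        // Text, mention, and equation elements all carry plain_text.
        if (part.TryGetProperty("plain_text", out JsonElement plain))
        {
            sb.Append(plain.GetString());
        }
    }
    return sb.ToString();
}

// Made-up sample title: a mention followed by a text element.
const string sampleJson = @"[
  { ""type"": ""mention"", ""plain_text"": ""Project Alpha"" },
  { ""type"": ""text"", ""plain_text"": "" notes"" }
]";

using JsonDocument doc = JsonDocument.Parse(sampleJson);
string fullTitle = GetFullTitle(doc.RootElement);
Console.WriteLine(fullTitle); // Project Alpha notes
```

Reading only element [0] of this sample would give "Project Alpha" and drop " notes", which is why the loop visits every element.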

tim-wag commented 5 months ago

I changed the code, but I don't know how to force the plugin to re-parse all pages of the databases it has access to, so I can't tell whether the change worked. Is there a specific context or command to do so?

AminSallah commented 5 months ago

The plugin should refresh the entire cache when restarting Flow Launcher with a good internet connection or by typing reload all plugins data in the Flow Launcher query.

tim-wag commented 5 months ago

Yea idk I may have broken something because I have no pages appearing at all. Anyway you could try to work with that (there may (very probably) be things to adjust) but I am wasting too much time on this :

Already in the code :

foreach (var kvp in properties){
  var values = (JObject)kvp.Value;
  if (values["type"].ToString() == "title"){
    title = kvp.Key;
  }
}

New (using the title Key of the loop above) :

string GetFullTitle (string titleObject) {
  string extractedTitle;
  foreach (const titleElement in titleObject) {
    extractedTitle += title[titleElement]["plain_text"]
  }
  return extractedTitle;
}

if (title) {
  string pageTitle = GetFullTitle(title);
}

The function should loop through every element of the title (text, mentions) in order and add them to a string that is then returned. Btw, is the if (title) necessary?

AminSallah commented 5 months ago

Hey @tim-wag, I see the error now. You're invoking the method with a string parameter instead of a JSON token, which doesn't make sense. You should take a look at this branch. Currently, there aren't many changes, and it seems stable. You can build and use it.

tim-wag commented 5 months ago

It's really weird because I wasn't getting any page through the search (even with normal titles). I deleted everything and reinstalled the version from the store that worked before, but now it doesn't connect to any page. But when linking the other Notion Search extension with the same Notion token, it finds all the pages. It still detects the databases tho, because I see them all in the default and relation databases menus.

binvius commented 5 months ago

Sorry to bump in but we too are facing the same issues. (Also, this has got to easily be one of the best plugins created for any program so thank you so so much for doing such a great job!)

Ignoring 'titles with mentions' for now, as even Notion's native search struggles to pick those up, we are also finding many pages and databases not being cached and therefore not presented upon a FlowLauncher search.

Perhaps having such a substantial workspace, we thought that it may just be taking a while to cache but it's been a very long time now.

As this is a fresh install, perhaps we have hit Notion's rate limiter/daily limit and so will have to wait a few days for the Flow.Launcher.Plugin.Notion to fully cache anything. In relation to that, you state that "the plugin should refresh the entire cache when restarting Flow Launcher" - does that mean it is wiping it and starting again, or does it have some logic to only update changes? (Should it be the former, then anyone with a workspace larger than Notion's daily API limits would never be able to use this plugin, as it'll never fully cache everything.)

Massive thanks in advance and let me know if you'd prefer a separate issue being opened.

Cheers!

AminSallah commented 5 months ago

> It's really weird because I wasn't getting any page through the search (even with normal titles). I deleted everything and reinstalled the version from the store that worked before, but now it doesn't connect to any page. But when linking the other Notion Search extension with the same Notion token, it finds all the pages. It still detects the databases tho, because I see them all in the default and relation databases menus.

Databases are cached by a different parser, which is why they still show up. The missing pages were caused by a logic error in one of the nested loops of the pages parser.

AminSallah commented 5 months ago

> Ignoring 'titles with mentions' for now, as even Notion's native search struggles to pick those up, we are also finding many pages and databases not being cached and therefore not presented upon a FlowLauncher search.

Currently the plugin doesn't ignore any page returned by the API, whatever its properties. If a page has no title, the plugin stores it in the cache with an empty title. It then appears in Flow search, but you can't see it while the search bar is not empty. To find untitled pages, type @ to show the databases, press Tab on a database to filter the results by it, then scroll down until you reach the untitled pages.

AminSallah commented 5 months ago

The search cache is crucial for the plugin, as it relies on it for editing, deleting, and numerous other functions. Therefore, it is handled differently. The plugin will not save or display any pages if any errors are returned by the Notion API. Even if only one error is returned, the plugin will halt the cache-building process and disregard any pages that were successfully returned, opting instead to use the old cache. If there is no existing cache, the plugin will disable itself and display an error in the search results to prevent any unexpected errors. That means when you look at the search results in FL, you are seeing all of the pages shared with your token.

AminSallah commented 5 months ago

> As this is a fresh install, perhaps we have hit Notion's rate limiter/daily limit and so will have to wait a few days for the Flow.Launcher.Plugin.Notion to fully cache anything. In relation to that, you state that "the plugin should refresh the entire cache when restarting Flow Launcher" - does that mean it is wiping it and starting again, or does it have some logic to only update changes? (Should it be the former, then anyone with a workspace larger than Notion's daily API limits would never be able to use this plugin, as it'll never fully cache everything.)

Currently, the plugin employs two different mechanisms. The first is triggered upon opening the Flow Launcher or restarting it. This mechanism should never trigger rate limit responses because it doesn't burst requests; instead, it operates as slowly as possible. If no errors are returned by the Notion API, it will wipe the old cache and install the new one.

Wiping the cache is crucial here. A user may delete a cached page via the Notion UI, and without a wipe that page would still be displayed in Flow Search despite its deletion. There's currently no way to hook into Notion to determine whether a page has been deleted, and I can't force users to delete pages only via Flow Launcher. Therefore, it's better to periodically clean up the cache.
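A small sketch of why replacing the cache behaves differently from merging into it. The cache shape (page id mapped to title) and the ids are hypothetical and only serve the illustration. They are not the plugin's real data structures.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache shape: page id -> title.
var oldCache = new Dictionary<string, string>
{
    ["page-1"] = "Groceries",
    ["page-2"] = "Deleted in the Notion UI",
};

// What a later full fetch returns: page-2 is gone, page-3 is new.
var fetched = new Dictionary<string, string>
{
    ["page-1"] = "Groceries",
    ["page-3"] = "Meeting notes",
};

// Merging keeps the stale entry, because nothing reports page-2 as deleted.
var merged = new Dictionary<string, string>(oldCache);
foreach (var (id, title) in fetched)
{
    merged[id] = title;
}

// Replacing drops it: the cache is exactly what the API returned.
var replaced = new Dictionary<string, string>(fetched);

Console.WriteLine(merged.ContainsKey("page-2"));   // True
Console.WriteLine(replaced.ContainsKey("page-2")); // False
```

The price of replacing is that the fetch has to be complete, which is why a partial fetch is never installed.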

The second mechanism is a refresh triggered when you activate Flow Launcher via your hotkey, even if the plugin keyword is not entered. It only picks up new pages, and only once about 14 seconds have passed since they were added in Notion.

AminSallah commented 5 months ago

@tim-wag @binvius, I'm planning to release a minor update that should address this issue. You'll need to remove the old plugin and its settings before installing the update. Alternatively, using a Flow Launcher portable version is recommended for testing purposes. Once the update is released, I'll notify you.

AminSallah commented 5 months ago

Hey there! You can now download the new version from here if it has not yet appeared on the plugin store.

https://github.com/AminSallah/Flow.Launcher.Plugin.Notion/releases/download/v2.1.0/Flow.Launcher.Plugin.Notion.zip

Or paste this in FL query

pm install https://github.com/AminSallah/Flow.Launcher.Plugin.Notion/releases/download/v2.1.0/Flow.Launcher.Plugin.Notion.zip

AminSallah commented 5 months ago

@binvius @tim-wag, Let me know if all of your shared pages are now appearing on Flow Search.

tim-wag commented 5 months ago

Hello. Thank you very much, my pages are now all appearing, even those with mentions in their title. It is now very convenient to use the plugin !

However, the bugs never seem to stop: I have databases named with "special" characters because of French. The pages inside such a database are accessible through their title, and the database appears in the results when using a blank "@" (with the database name showing fine in both cases). The thing is, when I select this database with Tab to search inside it, resulting in "$DATABASE$ QUERY", the pages of the database don't show up. This is probably linked to the Unicode form of the characters used, but I haven't really dug into it.

Anyways, have a nice day and thank you !

AminSallah commented 5 months ago

You are welcome, @tim-wag. I thought I fixed this in a previous version, but it turns out I fixed it in the page name only, not the filters. Thanks for bringing it to my attention. Check the release for the update.
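The thread does not confirm the root cause, so the following is only a sketch of the Unicode-form theory raised above: an accented name can be stored either composed (one code point) or decomposed (base letter plus combining accent), and an ordinal comparison treats the two as different. SameName is a hypothetical helper, not the plugin's filter code.

```csharp
using System;
using System.Text;

// Compare two names after normalizing both to NFC, so composed and
// decomposed accents are treated as the same text.
bool SameName(string a, string b)
{
    return string.Equals(
        a.Normalize(NormalizationForm.FormC),
        b.Normalize(NormalizationForm.FormC),
        StringComparison.OrdinalIgnoreCase);
}

string composed = "Caf\u00E9";    // é as a single code point
string decomposed = "Cafe\u0301"; // e followed by a combining acute accent

Console.WriteLine(string.Equals(composed, decomposed, StringComparison.Ordinal)); // False
Console.WriteLine(SameName(composed, decomposed)); // True
```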

AminSallah commented 5 months ago

Everything works fine. Closing this for now. Feel free to reopen if further discussion is required.

binvius commented 4 months ago

> @binvius @tim-wag, Let me know if all of your shared pages are now appearing on Flow Search.

@AminSallah

Apologies for the slight delay - been having a nightmare trying to get the API to play nicely and gave up for a while.

Oh, wow - thank you so so so much for all your additional effort as it appears to be working in the latest version! You are a god wizard amongst us mere mortals so all the thanks in the world for what you have achieved will never be sufficient! Cheers!

Sorry - I'm not too sure what you were referring to regarding "untitled pages" or if that response was actually meant for me but your tip for finding untitled pages will no doubt be very useful for any users reading this thread who happen to not name their pages for some reason.

Regarding what I mentioned about "titles with mentions" and the fact that "Notion's native search struggles to pick those up" - it is truly fantastic and remarkable that you have somehow successfully managed to have those included in Flow's search results, as so many users have been begging Notion for years to fix that issue! With a valuation of well over $10 billion, I have heard stories of Notion paying quite large sums of money to 3rd party devs that have fixed a problem like that, so it would be a very good idea to tell the Customer Service team at Notion to send your solution to their Engineers. The fact that Notion's search doesn't pick up mentions but yours does proves that it is possible; they just need someone fantastic like you to force it on them by repeatedly telling them how easy it is to fix. Imagine having such a massive impact on so many people around the world by helping fix that issue with them - aside from making some good cash, you might even end up saving lives, you never know.

I'm not entirely sure I understand what you mean by "the plugin will halt the cache-building process and disregard any pages that were successfully returned" - are you saying that if even one page returns an error, it doesn't simply skip that page like most software does, but instead crashes? Perhaps I'm not understanding but when I've worked with the API, that didn't appear to be a limitation. You seem super intelligent based on the fact you managed to include "titles with mentions" in the search even though Notion's own devs failed, so I'd be really surprised if you aren't able to build some logic in that skips any pages that return errors, rather than just crashing. If that's true, what if the 2nd page returned has an error in - does that mean that that user would only ever have access to the single page returned before that? Sorry if I am confused, I've just never heard anything like this before over the past few decades.

Thank you for confirming that Notion's rate limits were taken into consideration and that "if no errors are returned by the Notion API, it will wipe the old cache and install the new one."

Massive thanks in advance once again.

AminSallah commented 4 months ago

Thanks @binvius for the kind words and feedback! I'm glad the latest version seems to be working well for you.

Notion developers use a different API version, v3, for search. It offers extremely fast response times and returned 1,000 pages at once when I tested it. However, users are restricted to using v1 via an internal integration token. Initially, the plugin used v3 by simulating the Notion UI with a cookie session ID, but this posed potential security risks. If the session ID were hijacked, the attacker would have full control over every single page on Notion indefinitely without your knowledge. Therefore, I transitioned to using v1 and building a local cache instead of taking this risk. So, my process for handling pages is quite different from the approach taken by the Notion developers.

Regarding the cache-building process, let me clarify my point: the primary reason for using a full cache reinstall is that Notion does not provide a way to find out which pages were recently deleted. When the plugin encounters an error during cache-building, it halts the process and uses the last cache. This doesn't mean the plugin crashes entirely.

In a full cache build, the plugin requests 100 pages from the Notion API at a time, as this is the maximum allowed number. After processing the first 100 pages, the plugin requests another set of 100 and continues in this loop until it has cached all the pages shared with the token. Unexpected errors may arise, such as hitting the rate limit if the token is used elsewhere along with the plugin, or an internet connection error. In such cases, the plugin will lose at least 100 pages and may get stuck in an infinite loop trying to call the Notion API. As a result, the process is halted and the old cache is used.

In other words, if 1,000 pages are shared with the token and an error occurs at page 800, wiping the old cache and installing the partial one would make the plugin assume you have deleted the remaining 200 pages. Instead, the plugin keeps running on the last cache returned by a successful build. Most users will not notice they are running on an old cache. A full cache build may also be triggered automatically by separate logic, without needing to restart or reload plugin data. This should resolve issues if a user mistakenly creates a page in FL and then deletes it in the Notion UI.
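The all-or-nothing behaviour described above can be sketched roughly like this. BuildCache and fetchBatch are hypothetical names: fetchBatch stands in for one paginated Notion API request (up to 100 pages, returning a cursor for the next batch or null when done), and the two fake fetchers exist only to exercise both paths. This is an illustration of the idea, not the plugin's real code.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Build a fresh cache batch by batch. On any error, return the old cache
// untouched instead of installing a partial result.
List<string> BuildCache(
    List<string> oldCache,
    Func<string?, (List<string> Pages, string? NextCursor)> fetchBatch)
{
    var fresh = new List<string>();
    string? cursor = null;
    try
    {
        do
        {
            var (pages, next) = fetchBatch(cursor);
            fresh.AddRange(pages);
            cursor = next;
        } while (cursor != null);
    }
    catch (Exception)
    {
        // A partial cache would look as if the missing pages were deleted.
        return oldCache;
    }
    return fresh;
}

var old = new List<string> { "A", "B", "C" };

// Fake API: the first batch succeeds, the second one fails.
(List<string>, string?) Flaky(string? cursor)
{
    if (cursor == null) return (new List<string> { "A" }, "cursor-2");
    throw new InvalidOperationException("rate limited");
}

// Fake API: two batches, both succeed.
(List<string>, string?) Healthy(string? cursor)
{
    if (cursor == null) return (new List<string> { "A" }, "cursor-2");
    return (new List<string> { "D" }, null);
}

Console.WriteLine(string.Join(",", BuildCache(old, Flaky)));   // A,B,C
Console.WriteLine(string.Join(",", BuildCache(old, Healthy))); // A,D
```

Returning the old list on failure also guarantees the loop ends after the first error rather than retrying the same request forever.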

Thank you again for your thoughtful feedback. I appreciate it, and more improvements based on this issue are coming soon in version 3.