decaporg / decap-cms

A Git-based CMS for Static Site Generators
https://decapcms.org
MIT License

Speed extremely slow with populating relation dropdown (several minutes) #5920

Open delwin opened 3 years ago

delwin commented 3 years ago

Describe the bug

We have a template that uses four different list fields, each one allowing the administrator to specify any number of items from a collection to be displayed on a page. Each list is defined with a relation field widget, similar to the code below.

When there were fewer than 30 items in the respective collections and fewer than 5 or so items selected for each of the four lists, this was slow but acceptable (taking a second or two to populate the relation field dropdown). With 140 and 60 items in the referenced collections (stories and news), and 77 selected from one collection and 8 from the smaller collection, the waiting time for drop-down population seems to be on the order of several minutes. This population takes place every time a new item is added to the list, or an item's relation drop-down is clicked to edit it. There seems to be no caching of values between drop-downs despite pulling from the same source:

      - label: "Featured Stories⁺"
        singular_label: "Featured Story"
        hint: "⁺will be displayed in the order provided above."
        name: "featured_stories"
        widget: list
        ui: fields
        required: true
        allow_add: true
        fields:
        - label: "Altas post/story"
          name: "story"
          widget: "relation"
          collection: "atlas"
          search_fields: ["title"]
          value_field: "uuid"
          display_fields: ["title", "thumb_new"]
          hint: "Start typing the title of the Atlas post/story then select from the dropdown list"
        - {label: "Don't show thumbnail in list", name: "no_thumb", widget: "boolean", default: false, required: false}

To Reproduce

  1. Create a collection of 150 items,
  2. Create a field with the above displayed definition, referencing this collection
  3. Start adding items to the list and watch the performance take longer and longer until it's basically unusable for the average site editor

Expected behavior

I would expect the performance for selecting in this sort of situation to be good enough at least through a thousand or more items in the collection and at least a few hundred or more items selected from that collection.

Applicable Versions:

CMS configuration

See relevant snippet, above.

Additional context

If required, we can provide a copy of the repository with all config and collection files for analysis to a Netlify CMS engineer (not for direct public release).

erezrokah commented 3 years ago

Hi @delwin, related to https://github.com/netlify/netlify-cms/issues/4635

There seems to be no caching of values between drop-downs despite pulling from the same source:

Initial load can be slow as it needs to load all entries from the related collection. Once entries are loaded they should be cached, so that's definitely something we should look into. In my experience GitLab is also slower than GitHub (I don't have official benchmarks though).
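
As a rough illustration of the caching behaviour described here (entries loaded once, then shared by every dropdown that points at the same collection), the idea can be sketched as a memoized loader. This is only a sketch: `RelationEntry`, `EntryLoader` and `memoizeByCollection` are hypothetical names, not part of the CMS code base, and the real loader is asynchronous.

```typescript
// Hypothetical entry shape; the real CMS entry type is richer.
type RelationEntry = { uuid: string; title: string };

// Hypothetical loader signature: fetches every entry of one collection.
type EntryLoader = (collection: string) => RelationEntry[];

// Wrap a loader so each collection is fetched at most once,
// no matter how many relation dropdowns ask for it.
function memoizeByCollection(load: EntryLoader): EntryLoader {
  const cache = new Map<string, RelationEntry[]>();
  return (collection: string): RelationEntry[] => {
    const cached = cache.get(collection);
    if (cached !== undefined) {
      return cached;
    }
    const entries = load(collection);
    cache.set(collection, entries);
    return entries;
  };
}

// Usage: count how often the underlying (expensive) loader runs.
let loadCalls = 0;
const slowLoad: EntryLoader = (collection) => {
  loadCalls += 1;
  return [{ uuid: `${collection}-1`, title: "Example" }];
};
const cachedLoad = memoizeByCollection(slowLoad);
cachedLoad("atlas"); // first dropdown: hits the loader
cachedLoad("atlas"); // second dropdown: served from the cache
cachedLoad("news"); // different collection: hits the loader again
// loadCalls is now 2
```

With a cache like this, the cost of a dropdown would depend on the number of distinct collections rather than the number of list items.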

I'll try to create a test repo to simulate this case.

Can you share the browser network traffic (by opening the browser developer tools and going to the network tab)? The browser should indicate if requests are cached or not. For example this is how Chrome shows cached requests:

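[image: image] https://user-images.githubusercontent.com/26760571/141477653-954681f8-5b23-42a3-9799-21e7f5e3dbf9.png

delwin commented 3 years ago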

I can share the browser network traffic, but it goes on for ages - 10s of thousands of requests. I'll try to get this for you. It's definitely not caching.

I can provide you with a cloned repository for you to try this on, as well.

delwin commented 3 years ago

In fact, I'm still waiting for the admin page to completely load and I'm on 37k requests, most of the most recent are, of course, receiving 429 Too Many Requests responses.

image

delwin commented 3 years ago

The final 43k requests took almost 6 minutes to "finish" (many with a 429 response from gitlab). Once done I am not getting further requests when using a new dropdown to select a post except for the following requests (after clearing the network traffic from the console, to ensure I get the right requests):

image

erezrokah commented 3 years ago

Thanks @delwin, that's helpful - it seems you're hitting GitLab's rate limits. Looks like these changed not so long ago: https://docs.gitlab.com/ee/user/gitlab_com/index.html#gitlabcom-specific-rate-limits

I'll set up a reproduction and see if we can be more efficient. I'll try to post some answers next week.

delwin commented 3 years ago

Thanks Erez

erezrokah commented 2 years ago

The final 43k requests took almost 6 minutes to "finish" (many with a 429 response from gitlab). Once done I am not getting further requests when using a new dropdown to select a post except for the following requests (after clearing the network traffic from the console, to ensure I get the right requests):

Ok, so I finally was able to dig into this. I created a test repo here (https://gitlab.com/erezrokah/netlify-cms-reproductions/-/tree/netlify_cms/issue_5920). It has a collection referencing another collection that has 500 entries.

What you're seeing (with the 429 response) is GitLab rate limiting the initial load. The CMS identifies this situation and uses a backoff mechanism (https://github.com/netlify/netlify-cms/blob/05e7629cf413b8fc3cfa9ee2b15b6f5ef09e549c/packages/netlify-cms-lib-util/src/API.ts#L80) to handle it (basically slows everything down). After the initial load, not only does the CMS cache the requests using the browser IndexedDB (not the built-in cache mechanism my initial image shows), it also saves a local representation (https://github.com/netlify/netlify-cms/blob/05e7629cf413b8fc3cfa9ee2b15b6f5ef09e549c/packages/netlify-cms-lib-util/src/implementation.ts#L511) of the git tree for that collection. That means that in further attempts to retrieve the collection the CMS only retrieves the difference (the changes to the folder) instead of retrieving all the files. The reason for this is that GitLab paginates listing of files, so without this optimization we'd need to retrieve all pages (in comparison, GitHub lets you list 100K files in a single request).
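
To give a feel for why a backoff "slows everything down", here is a minimal sketch of an exponential backoff schedule. It assumes a doubling delay with a cap; the base delay, the cap and the function names are illustrative assumptions, and the CMS's actual backoff implementation may use a different strategy.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
// The doubling strategy and all numbers here are assumptions for illustration.
function backoffDelayMs(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Full list of delays for a given number of retries.
function backoffSchedule(retries: number, baseMs: number, maxMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt += 1) {
    delays.push(backoffDelayMs(attempt, baseMs, maxMs));
  }
  return delays;
}

const schedule = backoffSchedule(5, 1000, 10000);
// schedule is [1000, 2000, 4000, 8000, 10000]
```

Under these assumed numbers a single request that is rejected five times waits 25 seconds in total, which adds up quickly across tens of thousands of requests.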

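This is how it looks in local storage: [image: image] https://user-images.githubusercontent.com/26760571/142483748-e3490f17-620b-4b26-b86e-1d5f4e93cc9b.png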

The requests you're seeing here (https://github.com/netlify/netlify-cms/issues/5920#issuecomment-967224646) are used by the CMS to calculate the difference between the local version and the remote version.
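
The difference calculation can be pictured roughly as follows. This is a simplified sketch, assuming a folder listing can be represented as a map from file path to blob id; `FolderTree` and `diffTrees` are hypothetical names and the real data structures in the CMS are more involved.

```typescript
// Hypothetical representation of a folder listing: file path -> blob id.
type FolderTree = Record<string, string>;

type TreeDiff = { toFetch: string[]; toDelete: string[] };

// Compare the locally stored tree with the remote one:
// new or changed files must be fetched, vanished files dropped from the cache.
function diffTrees(local: FolderTree, remote: FolderTree): TreeDiff {
  const toFetch = Object.keys(remote)
    .filter((path) => local[path] !== remote[path])
    .sort();
  const toDelete = Object.keys(local)
    .filter((path) => !(path in remote))
    .sort();
  return { toFetch, toDelete };
}

const localTree: FolderTree = { "a.md": "id1", "b.md": "id2", "c.md": "id3" };
const remoteTree: FolderTree = { "a.md": "id1", "b.md": "id2-changed", "d.md": "id4" };
const treeDiff = diffTrees(localTree, remoteTree);
// treeDiff.toFetch is ["b.md", "d.md"], treeDiff.toDelete is ["c.md"]
```

Only the files in `toFetch` would need a network request, which is why later loads are much cheaper than the first one.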

There isn't a simple solution for this, as the relation widget needs to load all files to create the relation. One possible option is to use a file collection with a list widget and reference items in the list (see more here: https://www.netlifycms.org/docs/widgets/#relation), instead of a folder collection with multiple entries. Another option is to reach out to GitLab to see if the rate limits are configurable, or to see if they can provide a way to retrieve multiple files in a single request. This might be possible via the GraphQL API (https://docs.gitlab.com/ee/api/graphql/), but will require a lot of effort.

Please let me know if that helps.

delwin commented 2 years ago

Hi Erez,

I have tried understanding the couple options you sent, particularly the one of using a file collection with a list widget. I've looked through and tried to figure out what you might mean by trying a number of things, without being able to get anywhere. But maybe you can explain what you mean, more in depth?

We have a folder collection because editors of the site create new items in it all the time. I thought maybe you were referencing a new type of file collection, or making a file collection dynamically from a folder collection from which we can then select, but I'm not finding a way to make this work.

So can you expand on that possibility?

We switched to GitLab because on GitHub we had one of two problems (depending on the version of the CMS we pegged things to: a pre-GraphQL version or the latest version). One issue was the same 429 rate limit, but in a much more severe form, not allowing editors to proceed in any way. That was with GraphQL not enabled. The other issue was with the Media library, which didn't work on GitHub, but worked on GitLab when allowing the latest (GraphQL-allowed) CMS versions to run. With those two crippling errors, we had to switch to GitLab, which, except for this one page, allows us to function very well.

Delwin

delwin commented 2 years ago

I have done some more testing, re-creating the repository again on Github so that the GraphQL API can be used, to check to see how this helps.

The response time to the editor is just as abysmal, to the point of being unusable.

The number of requests made to Github is definitely vastly reduced (to around 800 from around 40000 without GraphQL on Gitlab), but this doesn't seem to help with the responsiveness.

Other notes:

When going to the Collections list page first I notice that a caching message is indicated the first time going to the collections list. It seems that perhaps the entries are only cached when listing them in the Collections screen(??) When doing that I get an initial "This may take several minutes" message, and when scrolling down through each "page" of entries, a "Loading Entries..." message that takes a minute before being replaced with another page of the entries, finally taking several minutes before I'm able to get to the bottom of the list of 150 items.

However, even when these entries are cached for the Collections list screen, that cache does not seem to be used in the drop downs (the relationship widget), as using them even after the caching in the list still takes 5 minutes to populate.

So this is no better on Github with GraphQL than it is on Gitlab (where I guess GraphQL is not yet supported by Netlify CMS).

Delwin

erezrokah commented 2 years ago

Hi @delwin and sorry for the very late response, I was out of office for a week+.

For the file collection suggestion - the config will look like:

collections:
  - name: posts
    label: Posts
    label_singular: 'Post'
    folder: content/posts
    create: true
    fields:
      - label: Title
        name: title
        widget: string
      - label: 'Featured Stories⁺'
        singular_label: 'Featured Story'
        hint: '⁺will be displayed in the order provided above.'
        name: 'featured_stories'
        widget: list
        ui: fields
        required: true
        allow_add: true
        fields:
          - label: 'Story'
            name: 'story'
            widget: 'relation'
            collection: 'stories'
            file: 'stories'
            search_fields: ['stories.*.title']
            value_field: 'stories.*.title'
            display_fields: ['stories.*.title']
  - name: stories
    label: Stories
    files:
      - file: content/stories/stories.json
        name: stories
        label: Stories
        fields:
          - label: Stories
            name: stories
            widget: list
            fields:
              - label: Title
                name: title

Notice the wildcard * char to reference a list.

For differences between GitHub and GitLab when reaching rate limits - GitHub has a 1 hour window, so if you reach the limit you'd have to wait up to 1 hour for that window to reset.

As for GitHub with GraphQL - let me look into the issues, but please consider the CMS doesn't currently take full advantage of the GraphQL API. We're not grouping multiple queries into a single request as much as we should yet, and in some cases we can't use the GraphQL API at all since it doesn't support everything the REST API supports. For example it doesn't support binary blobs (like images), see https://docs.github.com/en/graphql/reference/objects#blob.

I'll port my test repo to GitHub and see if it works as expected with/without GraphQL, and also check the media library issue.

delwin commented 2 years ago

Thanks @erezrokah

So that's what I thought you might mean with the file collection, but that means that every time a story is added, deleted or has its title changed (we don't use title for the relationship field because of this), we'd have to manually update the stories.json file, which isn't feasible for a manual process without errors and seems like an unwieldy process.

It is something I have considered doing, having a server process running regularly to generate such a stories.json file and commit its changes to the repository, but it really doesn't seem like an ideal solution.

Or am I missing something in your suggestion?

erezrokah commented 2 years ago

Hi @delwin, that's a good point.

For that scenario the recommendation is to use an ID widget like this one and use that field for the relation.

Would that cover your use case?

delwin commented 2 years ago

Hey @erezrokah,

As I indicated we already don't use the title for the relationship field, and we do have an id widget that is used for that purpose.

The issue I raise is that in order to make this sort of "solution" work, the stories.json file needs to be manually updated every time a story is added, deleted, or a title changed (so that the title being searched appears correctly). This cannot happen within the CMS workflow, and would need to be done outside of the CMS, so it seems very unwieldy.

lorenzode commented 2 years ago

Hi,

we have the same issue with Gitlab. We have an object consisting of an object and a list of objects which have the relation field. Opening the initial object form element takes quite some time. Each relation type has several hundred potential relation files. We implemented this way because we needed a 1:n relation.

        fields:
        - label: "Section"
          name: "section"
          widget: "object"
          required: false
          fields:
            - label: "Section title"
              name: "title"
              widget: "object"
              required: false
              fields:
              - {label: "Label", name: "label", widget: "string", hint: "The title of the content section", required: false}
              - {label: "Link", name: "link", widget: "string"}
            - label: "Content"
              name: "content"
              widget: "list"
              allow_add: true
              fields:
              - label: "Article"
                name: "article"
                widget: "object"
                required: false
                fields:
                - label: "Select Article"
                  name: "uuid"
                  widget: "relation"
                  collection: "article"
                  searchFields: ["title"]
                  valueField: "uuid"
                  displayFields: ["title"]
                  required: false
                - label: "Select card size"
                  name: "size"
                  widget: "select"
                  required: false
                  options:
                  - { label: "One column wide", value: 1 }
                  - { label: "Two columns wide", value: 2 }
                  - { label: "Three columns wide", value: 3 }
              - label: "Podcast"
                name: "podcast"
                widget: "object"
                required: false
                fields:
                - label: "Select Podcast"
                  name: "uuid"
                  widget: "relation"
                  collection: "podcast"
                  searchFields: ["title"]
                  valueField: "uuid"
                  displayFields: ["title"]
                  required: false
                - label: "Select card size"
                  name: "size"
                  widget: "select"
                  required: false
                  options:
                  - { label: "One column wide", value: 1 }
                  - { label: "Two columns wide", value: 2 }
                  - { label: "Three columns wide", value: 3 }

(Edited for clarification)

delwin commented 2 years ago

Yes, I get the impression more and more that Netlify CMS was a fun experiment that had some hope, but really only works well for small sites with very little structure, unfortunately. The reliance on git was a cool idea, but is obviously not very scalable.

lorenzode commented 2 years ago

@erezrokah we are hosting our own Gitlab instance. There are no Gitlab API limits set for authenticated users. Any other Gitlab settings I might tweak?

Thanks a lot!

erezrokah commented 2 years ago

Hi @delwin and @lorenzode 👋

@lorenzode for the performance issues with GitLab, in order to figure out the relation the CMS needs to read all the files. This is especially slow on initial load (before caching kicks in). The only way I can think of to improve this is to use the GraphQL API. We're currently using the GitLab REST API, which means we need to list all files (and paginate on them 100 at a time), and then read each file in a separate request. Let's say you have 1000 files in a collection: we'll need 10 requests just to list them (those need to happen serially) and then another 1000 to read the content of the files (those are done in parallel, but still...). With the GraphQL API we should, in theory, be able to group multiple queries into a single request. I haven't dug into GitLab's GraphQL API and what it supports, so I can't really say at the moment what the benefits would be. I did open a separate issue for it here, in case someone would like to dig in further.
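
The arithmetic in this comment can be written down as a small helper. It is a back-of-the-envelope sketch only: `restRequestCount` is a hypothetical name, and it ignores retries, caching and any other requests the CMS makes.

```typescript
type RequestEstimate = { listRequests: number; readRequests: number; total: number };

// One list request per page of files, plus one read request per file.
function restRequestCount(fileCount: number, pageSize: number): RequestEstimate {
  const listRequests = Math.ceil(fileCount / pageSize);
  const readRequests = fileCount;
  return { listRequests, readRequests, total: listRequests + readRequests };
}

const example = restRequestCount(1000, 100);
// example is { listRequests: 10, readRequests: 1000, total: 1010 }
```

The request count grows linearly with the number of files in every collection a relation widget points at.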

@delwin I'm not sure I understand the need to sync the stories.json file manually. Having a file collection will let you edit that file within the CMS. Instead of a folder collection with a file per story, you'd have a file collection with a list of items, each one representing a story object.

mikewolfd commented 2 years ago

I'm seeing this happen in an environment with only 30 or so files but many relation fields. The number of requests doesn't seem to stop, and it keeps crashing Chrome.

delwin commented 2 years ago

@erezrokah I will explain the need for a synchronization by starting with a more complete structure of the site, although I believe you have seen the complete structure before with a different bug:

  1. There are many "Stories" that each have a page on the web site
    • these are therefore in a folder collection so that each generates a page
  2. There are some pages in the web site that also reference these stories in a list
    • these page templates use the Relation widget to choose the selected Stories from the folder collection
    • only the Title, ID and whether there is a particular image associated with them is needed to be used by the Relation widget to help the user select items
    • These Relations widgets are in a List widget where the admin can add and remove as many as they want
  3. There are getting close to 200 "Stories" (articles) and one page in particular has about 50 of them selected in its list.
  4. There is another collection of "News" that works in a similar way to "Stories" and has the same issue, but the "News" items selected on other pages is usually a somewhat shorter list (maybe around 10)

So the issue is that the Stories are and must be in a folder collection, since they produce web pages. We cannot transform the Stories to a file collection because they are not simply used in a list on one or two pages. This means that if we want to use a file collection to make the Relation widget more performant, we would need to have it created on-the-fly whenever the folder collection is updated (particularly when a new Story is added, one is deleted or a title is updated). The file collection cannot functionally replace the folder collection.

erezrokah commented 2 years ago

Thanks @delwin, that makes sense. Using a file collection will require changing the way pages are generated during the build.

The function to optimize would be this one: https://github.com/netlify/netlify-cms/blob/0053ebbb46f189469c86518eb458daa34a8fec4d/packages/netlify-cms-backend-gitlab/src/implementation.ts#L199

It's a bit complex at the moment since it does what I described in https://github.com/netlify/netlify-cms/issues/5920#issuecomment-973217330 (caching each collection locally and only getting a diff).

If we can list files and get their content using a GraphQL API query, that should boost performance quite a bit, and it could be a drop-in replacement for that function.

I wonder if at some point we'll hit these limits, but it's worth a try. I'll look at the GitLab GraphQL API and see what we can do.

delwin commented 2 years ago

Hey @erezrokah,

I have now made a couple file collections that mirror the folder collections so that we can use and test that.

BTW, we use hexo as a site generator, and I have created a small hexo generator plugin to generate the file collection files we would need (one for each type of article, only one of which I've mentioned in this thread). I have thus created file collections for these relationships to use, mirroring the folder collections, but only containing the fields necessary for the relationship widget (Title, ID). Note: This is not yet a complete solution, as I have to manually commit the changes to these file collections to git for them to be picked up.
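
The index generation step can be sketched independently of hexo. The snippet below is not the plugin mentioned above; it only assumes that each story exposes `uuid`, `title` and `layout`, and shows how a full list of stories could be reduced to a file shaped like the `story_index.json` shown further down. `Story`, `IndexRow` and `buildStoryIndex` are hypothetical names.

```typescript
// Hypothetical shape of a full story as a generator might see it.
type Story = { uuid: string; title: string; layout: string; body: string };

// Only the fields the relation widget needs.
type IndexRow = { title: string; uuid: string; layout: string };

// Reduce full stories to index rows, keeping the input order.
function buildStoryIndex(stories: Story[]): { posts: IndexRow[] } {
  const posts = stories.map(({ title, uuid, layout }) => ({ title, uuid, layout }));
  return { posts };
}

const storyIndex = buildStoryIndex([
  { uuid: "id-1", title: "Title 1", layout: "post", body: "Long text..." },
  { uuid: "id-2", title: "Title 2", layout: "post", body: "More text..." },
]);

// Serialized with 4-space indentation; this string is what would be committed.
const storyIndexJson = JSON.stringify(storyIndex, null, 4);
```

The resulting file still has to be committed to the repository for the CMS to see it, which is the manual step described above.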

The Stories file collection contains the ~150 stories in it.

Using this configuration, where the relationship field picks up values from a file collection instead of the folder collection, is quite surprisingly no faster when the editor is first loaded for a page (the relationship widgets aren't yet cached). Once the select widget is clicked and typing is started to search for an article, it still takes several minutes to load the widget with the choices.

Once these several minutes have passed, searches only take several seconds (2-3 at best, 5 at worst). This still seems exceedingly slow for a process where only 150 titles are searched for possible matches, so I'm not sure what the widget is doing. This time is consistent across all such widgets on the page (remember the relationship widget is in a list widget, and there are at least two of these lists using the same relationships on this page).

Once the editor page is exited, however, going back to the editor for the page starts everything all over: several minutes to load data the first time a relationship widget is activated, then several seconds after that.

So there seems to be no caching of a relationship widget that uses identical collections and search conditions between page edits. The cache is rebuilt every time a page is edited, and the time to construct it is exceedingly long considering it now only needs to load one file to get the information it needs (150 records, and about 26kB in size).

BTW, I have also made sure that there is no preview configured for the page, in case the preview was adding any time to it. So the "preview" is the default preview, listing values (in this case IDs).

Here is the config of the file collection and the corresponding generated collection:

Definition of the file collection:

  - name: "data"
    label: "💾 Global Settings"
    editor:
      preview: false
    files:
    - name: "story_index"
      label: "⛔ Stories Index"
      file: "source/_data/story_index.json"
      fields:
      - label: "Stories List for use in admin (not edited manually)"
        name: posts
        widget: list
        ui: posts
        required: true
        fields:
        - {label: "ID", name: "uuid", widget: "string"}
        - {label: "Story Title", name: "title", widget: "string"}
        - {label: "Type", name: "layout", widget: "string"}

Use of the file collection in a page:

        - label: "Stories"
          singular_label: "Story"
          name: "spotlight_stories"
          widget: list
          ui: fields
          required: true
          allow_add: true
          fields:
          - label: "Story"
            name: "story"
            widget: "relation"
            collection: "data"
            file: "story_index"
            search_fields: ["posts.*.title"]
            value_field: "posts.*.uuid"
            display_fields: ["posts.*.title"]

story_index.json:

{
    "posts": [
        {
            "title": "Title 1",
            "uuid": "id-1",
            "layout": "post"
        },
        {
            "title": "Title 2",
            "uuid": "id-2",
            "layout": "post"
        },
        {
            "title": "Title 3",
            "uuid": "id-3",
            "layout": "post"
        },
        ...
    ]
}

erezrokah commented 2 years ago

Hi @delwin, thank you for the additional information.

I'm currently looking into GitLab's GraphQL API to see if it provides better performance, see https://github.com/netlify/netlify-cms/issues/6034#issuecomment-995918416.

I'll update once/when I make some progress.

lorenzode commented 2 years ago

Thanks @erezrokah !

erezrokah commented 2 years ago

Hello 👋 I currently have a draft PR that uses the GitLab GraphQL API to retrieve 50 files at a time (to avoid hitting the query complexity limit).

There are still improvements to be made (like saving to a local cache instead of in-memory cache), but it should help speed up initial load quite a bit. If you're interested, the code is here https://github.com/netlify/netlify-cms/pull/6059/files#diff-570651114617393164ce3002abb929d59e73590014360152082e250e3075d772R468

Also, it will require some more testing to see if query complexity is impacted by the content size of files (my test repo has small files at the moment).
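
For readers who want a picture of "50 files at a time", the batching step itself can be sketched like this. It only shows how a list of file paths would be split into groups; `toBatches` and the example paths are hypothetical, and the actual GraphQL query built in the PR is not reproduced here.

```typescript
// Split a list into consecutive groups of at most `batchSize` items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1) {
    throw new Error("batchSize must be at least 1");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

// Example: 120 made-up file paths, grouped 50 at a time.
const paths = Array.from({ length: 120 }, (_, i) => `content/posts/post-${i}.md`);
const batches = toBatches(paths, 50);
// batches.length is 3, with sizes 50, 50 and 20
```

Each batch would then correspond to one request instead of one request per file.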

It would be great to get some early feedback on this, so if anyone would like to try it out at this stage, you can follow our contributing guide. See https://github.com/netlify/netlify-cms/blob/782e87c48a14937fcc7167ae5a7960a692e8054c/CONTRIBUTING.md#debugging

You'll need to have a similar config to this:

backend:
  name: gitlab
  branch: <branch>
  repo: <repo>
  use_graphql: true

delwin commented 2 years ago

I'm going to try to take a look at this in the next few weeks. Thanks @erezrokah

erezrokah commented 2 years ago

Initial support for the GitLab GraphQL API is released in netlify-cms@2.10.183. See the docs for more information.

Going to keep the issue open to get some feedback.

lorenzode commented 2 years ago

Thanks so much! We will test this in the new year and give feedback here.

delwin commented 2 years ago

Hi @erezrokah . Happy New Year!

So we did test this, even going so far as to have the team of editors try it on the live administration. Unfortunately, enabling graphql caused several issues in a few places, so we've had to disable that:

lorenzode commented 2 years ago

@erezrokah

We tried 2.10.183 on an instance where we have this structure: List > Object > List > Objects with relation field. We host our own Gitlab. Each relation points to 50 to 100 files. I definitely see an improvement in speed when loading the form although there is still some loading time when revealing the last object widget and until the relations are displayed in the relation widget. It's not snappy but it makes this form usable for us again.

Let me know if you want me to supply additional debugging information. Many thanks!

erezrokah commented 2 years ago

Hi everyone, sorry for the delay. This issue got pushed back due to other priorities.

reported zero records found when using graphql

This is probably related to us hitting the query complexity limit. We'll need to add some error handling and fall back to retrieving fewer entries (the failure is probably reflected in the query response in the browser network tab).
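
A rough sketch of what such a fallback could look like, assuming a failed request can simply be retried with a smaller batch. The names `fetchWithFallback`, `nextBatchSize` and `fetchBatch` are hypothetical, and the sketch treats every rejection as a complexity failure, which real code would need to distinguish from other errors.

```typescript
// Hypothetical: halve the batch size, but never go below `minSize`.
function nextBatchSize(current: number, minSize: number): number {
  return Math.max(minSize, Math.floor(current / 2));
}

// Hypothetical: `fetchBatch` stands in for one GraphQL request that rejects
// when the server refuses the query (for example, complexity too high).
async function fetchWithFallback<T, R>(
  items: T[],
  fetchBatch: (batch: T[]) => Promise<R[]>,
  initialSize = 50,
  minSize = 1,
): Promise<R[]> {
  const floor = Math.max(1, minSize);
  let size = Math.max(floor, initialSize);
  let index = 0;
  const results: R[] = [];
  while (index < items.length) {
    const batch = items.slice(index, index + size);
    try {
      results.push(...(await fetchBatch(batch)));
      index += batch.length; // advance only after a successful request
    } catch (error) {
      if (size <= floor) {
        throw error; // cannot shrink further, so surface the failure
      }
      size = nextBatchSize(size, floor); // retry the same position with fewer entries
    }
  }
  return results;
}
```

The smaller size is kept for the remaining batches rather than reset, on the assumption that later files are no cheaper to query than the ones that failed.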

I'll try to improve it by the end of this week.

jeremyzilar commented 2 years ago

@erezrokah might this be a related problem? https://github.com/netlify/netlify-cms/issues/6410

wing5wong commented 2 years ago

Still having this issue as well. Related: #4097

delwin commented 2 years ago

We still find this a crippling problem for editors: relationship fields (relation widgets, displayed as a type of custom drop-down) take minutes, sometimes tens of minutes, to populate when editors try to select a post on a page. We have tried things like paring down the git repository, ensuring that editors have good amounts of memory for caching, etc.

In the most recent trials I have done as a developer who sees the issues on my machine, too, I have found that the issue is no longer requests being made from the browser on the initial use of a relationship field. In fact, I have always told the editors to give the CMS a few minutes (10-15 or more) to populate the cache after the first time they connect, so that all data can be cached properly. This seems to happen.

Once the cache is populated, however, the drop-down relation widget still takes 10-15 minutes to populate (or more, depending on computer and browser) the first time one is used (remember, we use relation widgets as part of a list widget, so adding another relation to the list after one has already been added successfully works much faster). The second time a relation is added to the list and a portion of the title typed, the drop-down populates within 5-10 seconds.

However, using the same template, or coming back to the same page within the same editing session, the initial population time returns to 10-15 minutes or more.

During all of this drop-down (relation widget) population time, there is almost no network traffic. None, in fact, other than an occasional "user" ping once every 5 minutes.

So this population time is all some sort of JavaScript processing, completely unrelated to Gitlab network traffic and responses.

roemchine commented 1 year ago

There is a possible solution for this problem. We created our own relation widget that uses only the slug as the search, display and value field. It is basically a select widget that loads its options by listing all files from a folder collection. Unfortunately, the widget props provide no function to list all files in a directory of the git repository, so I called the GitLab API directly. We have 1400+ files in the referenced collection and it works within a few seconds.

Warning: This code only works for GitLab SaaS, but it can be adapted to other backends.

simple_relation/SimpleRelationControl.tsx

import React from 'react';
import { Map, List, fromJS } from 'immutable';
import { find } from 'lodash';
import Select from 'react-select';
import { reactSelectStyles } from 'netlify-cms-ui-default';
import { validations } from 'netlify-cms-lib-widgets';
import { CmsWidgetControlProps } from 'netlify-cms-core'
import { API } from 'netlify-cms-backend-gitlab'

type FileEntry = { id: string; type: string; path: string; name: string };

function optionToString(option) {
  return option && option.value ? option.value : null;
}

function convertToOption(raw) {
  if (typeof raw === 'string') {
    return { label: raw, value: raw };
  }
  return Map.isMap(raw) ? raw.toJS() : raw;
}

function getSelectedValue({ value, options, isMultiple }) {
  if (isMultiple) {
    const selectedOptions = List.isList(value) ? value.toJS() : value;

    if (!selectedOptions || !Array.isArray(selectedOptions)) {
      return null;
    }

    return selectedOptions
      .map(i => options.find(o => o.value === (i.value || i)))
      .filter(Boolean)
      .map(convertToOption);
  } else {
    return find(options, ['value', value]) || null;
  }
}

interface Props extends CmsWidgetControlProps<any> {
  t: any
  setActiveStyle(): void
  setInactiveStyle(): void
}

export default class SimpleRelationControl extends React.Component<Props> {

  state: {
    initialOptions: string[]
  }

  constructor(props: Props) {
    super(props)

    this.state = {
      initialOptions: []
    }
  }

  isValid = () => {
    const { field, value, t } = this.props;
    const min = field.get('min');
    const max = field.get('max');

    if (!field.get('multiple')) {
      return { error: false };
    }

    const error = validations.validateMinMax(
      t,
      field.get('label', field.get('name')),
      value,
      min,
      max,
    );

    return error ? { error } : { error: false };
  };

  handleChange = selectedOption => {
    const { onChange, field } = this.props;
    const isMultiple = field.get('multiple', false);
    const isEmpty = isMultiple ? !selectedOption?.length : !selectedOption;

    if (field.get('required') && isEmpty && isMultiple) {
      onChange(List());
    } else if (isEmpty) {
      onChange(null);
    } else if (isMultiple) {
      const options = selectedOption.map(optionToString);
      onChange(fromJS(options));
    } else {
      onChange(optionToString(selectedOption));
    }
  };

  async componentDidMount() {
    const { field, onChange, value } = this.props;
    if (field.get('required') && field.get('multiple')) {
      if (value && !List.isList(value)) {
        onChange(fromJS([value]));
      } else if (!value) {
        onChange(fromJS([]));
      }
    }
    // Read the logged-in user (including the access token) from localStorage.
    const user = JSON.parse(localStorage.getItem('netlify-cms-user') || 'null')
    const directory = field.get('folder')
    // Query GitLab only once, and only when a GitLab user and a folder are available.
    if (user && user.backendName === 'gitlab' && directory && this.state.initialOptions.length === 0) {
      const api = new API({
        token: user.token,
        branch: field.get('branch'),
        repo: field.get('repo'),
        squashMerges: false,
        initialWorkflowStatus: '',
        cmsLabelPrefix: '',
        useGraphQL: false
      })
      const result: FileEntry[] = await api.listAllFiles(directory)
      // Use each file name without its extension (the slug) as an option.
      this.setState({
        initialOptions: result.map(it => it.name.replace(/\.[^/.]+$/, ""))
      })
    }
  }

  render() {
    const { field, value, forID, classNameWrapper, setActiveStyle, setInactiveStyle } = this.props;
    //const fieldOptions = field.get('options');
    const isMultiple = field.get('multiple', false);
    const isClearable = !field.get('required', true) || isMultiple;

    const options = [...this.state.initialOptions.map(convertToOption)];
    const selectedValue = getSelectedValue({
      options,
      value,
      isMultiple,
    });

    return (
      <Select
        inputId={forID}
        value={selectedValue}
        onChange={this.handleChange}
        className={classNameWrapper}
        onFocus={setActiveStyle}
        onBlur={setInactiveStyle}
        options={options}
        styles={reactSelectStyles}
        isLoading={this.state.initialOptions.length == 0}
        isMulti={isMultiple}
        isClearable={isClearable}
        placeholder=""
      />
    );
  }
}
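
The options in the widget above are just file names with the extension stripped. A small standalone sketch of that mapping, for illustration only: `fileNameToSlug` and `toOptions` are hypothetical helper names, not part of the widget or of any CMS API.

```typescript
type Option = { label: string; value: string };

// Strip the last extension from a file name, using the same regex as the widget.
function fileNameToSlug(name: string): string {
  return name.replace(/\.[^/.]+$/, '');
}

// Build select options where label and value are both the slug.
function toOptions(fileNames: string[]): Option[] {
  return fileNames
    .map(fileNameToSlug)
    .map(slug => ({ label: slug, value: slug }));
}
```

Note that a dotfile such as `.gitkeep` maps to an empty string with this regex, so a folder containing one would produce an empty option unless such entries are filtered out first.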

simple_relation/schema.ts

export default {
  properties: {
    multiple: { type: 'boolean' },
    min: { type: 'integer' },
    max: { type: 'integer' },
    folder: { type: 'string' },
    branch: { type: 'string' },
    repo: { type: 'string' },
  },
  required: ['folder', 'branch', 'repo'],
};

cms.ts

import CMS from 'netlify-cms-app'
import NetlifyCmsWidgetSelect from 'netlify-cms-widget-select';
import SimpleRelationControl from './simple_relation/SimpleRelationControl';
import simpleRelationSchema from './simple_relation/schema';

// Initialize the CMS object
CMS.init()

// @ts-ignore
CMS.registerWidget('simple_relation', SimpleRelationControl, NetlifyCmsWidgetSelect.previewComponent, simpleRelationSchema)

usage in config.yml

...
fields:
- label: Source
  name: source
  widget: simple_relation
  folder: *sourceFolder
  repo: *backendRepo
  branch: *backendBranch
...