silverstripe / silverstripe-framework

Silverstripe Framework, the MVC framework that powers Silverstripe CMS
https://www.silverstripe.org
BSD 3-Clause "New" or "Revised" License
721 stars 821 forks source link

[ORM] Search form results Pagination breaks when there are results without canView permission #2791

Open mansisheth13 opened 10 years ago

mansisheth13 commented 10 years ago

Seems related: #1529

The SearchForm logic for search and permission based filtering works as below:

SearchForm.php - Lines 135 - 144

        if(strpos($keywords, '"') !== false || strpos($keywords, '+') !== false || strpos($keywords, '-') !== false || strpos($keywords, '*') !== false) {
            $results = DB::getConn()->searchEngine($this->classesToSearch, $keywords, $start, $pageLength, "\"Relevance\" DESC", "", true);
        } else {
            $results = DB::getConn()->searchEngine($this->classesToSearch, $keywords, $start, $pageLength);
        }

        // filter by permission
        if($results) foreach($results as $result) {
            if(!$result->canView()) $results->remove($result);
        }

The MySQLDatabase::searchEngine() function returns a PaginatedList, for which the number of total items is set in the searchEngine function. Lines - 932-940

        $list = new PaginatedList(new ArrayList($objects));
        $list->setPageStart($start);
        $list->setPageLength($pageLength);
        $list->setTotalItems($totalCount);

        // The list has already been limited by the query above
        $list->setLimitItems(false);

        return $list;

Since the result is filtered for canView() permissions, after the PaginatedList is created, the totalItems for the PaginatedList doesn't get reset.

As a result, the Pagination may display 5 pages, but there could be results displayed on 3 pages only.

e.g. http://lgsuper-2014.sparkinprogress.com.au/ Here we are using SecureAssets module to restrict access on a folder 'Director' to which has 300+ files. When you search for 'director' search results display about 44 pages of results, while results are only displayed on pages 1 & 2.

nglasl commented 9 years ago

+1 for this issue.

rb1t commented 4 years ago

Summarizing some discussion from Slack: The biggest challenge in solving this, is relating to how canView inheritance works for any given record (Page/File). Because it requires you to traverse up a record’s ancestry to understand its view permissions (for one or more Groups), it is not very performance-friendly (particularly for larger sites with lots of Pages/Files spanning multiple levels of SiteTree nesting/File Directories). A few candidate workarounds/solutions:

  1. Solve through the UI by using a “view more results” button/link instead of the current paginated approach. To show a list of pages: fetch some, filter, and if you don’t have enough items after filtering, fetch some more. Then, generate the subsequent “more” link based on how far you got to in the list. Your more link can take the form “?start=XXX” which would be shareable/index-able, but there is a UI challenge in seeing results prior to this (so maybe a ‘show previous results’ button, albeit not perfect).

  2. Store the canView permission directly on the record (Page/File). It would require serializing multiple user groups on the record (or setting up a junction table to map allowed Groups to that Page/File record), but in theory this could happen fairly easily when that record is saved. The bigger issue with this approach is that if the parent changes its permissions (in that File’s parent Directory or Page in Site Tree) OR the Group changes its parent Group, all of the children need to be updated. Not practical or efficient.

  3. Pre-query the database, filter it for allowed results, and then paginate through that filtered list. In other words, turn the canView filter into a DataList::filter(). This is a nice approach in theory, but in practice it's exposed to the same performance issues in the original challenge stated above (again, particularly on larger sites with lots of Files + Pages and hierarchy).

ScopeyNZ commented 4 years ago

3) assumes that permissions are deterministic from the database but that's not the case.

I think if the database was the source of truth for permissions, regardless of how it's related to the records you're trying to view then it shouldn't be too bad to write some performant queries to do what needs to be done.

The problem with this issue is that custom logic can be applied with a canView function, and it's completely unreasonable to expect the ORM to magically cache this in the database layer with proper cache invalidation (etc).

The real fix here is rewriting the ORM ACL in SilverStripe (5?) and removing canView functions - or moving them to a separate interface so they can be easily ignored if not present. We can switch the permissions entirely to the database for most objects, and then have nicer ORM defaults and procedures for fetching models that can be viewed by a given user.