modxcms / SimpleSearch

A simple search component for MODx Revolution
http://rtfm.modx.com/extras/revo/simplesearch
9 stars 19 forks source link

When limiting search to parent IDs, search is limited to current context #69

Open juro opened 5 months ago

juro commented 5 months ago

If &idType='parents' and some &ids are specified, search is limited to current context only. This is due to method processIds() in SimpleSearchDriver.php, which uses modx->getChildIds($id, $depth). MODX docs (here: https://docs.modx.com/3.x/en/extending-modx/modx-class/reference/modx.getchildids) briefly mentions that this method search only current context, if no other context is specified as additional condition in third optional param. I don't see any simple workaround here. The best would be to construct this third param for getChildIds with all contexts specified in &contexts script parameter if possible. At the moment specifing &contexts further limits results returned by processIds()

halftrainedharry commented 5 months ago

Here is a possible solution for this that uses the values from the &contexts property to run getChildIds() in a loop (until child IDs are returned).

protected function processIds(string $ids = '', string $type = 'parents', int $depth = 10): array
    {
        if ($ids === '') {
            return [];
        }

        $ctx = !empty($this->config['contexts']) ? $this->config['contexts'] : '';
        $ctx = $this->cleanIds($ctx);
        $ctxArray = !empty($ctx) ? explode(',', $ctx) : [];

        $ids = $this->cleanIds($ids);
        $idArray = explode(',', $ids);
        if ($type === 'parents') {

            foreach ($idArray as $id) {
                if (count($ctxArray) > 0){
                    foreach ($ctxArray as $ctx) {
                        $child_ids = $this->modx->getChildIds($id, $depth, ['context' => $ctx]);
                        if (count($child_ids) > 0){
                            $idArray = array_merge($idArray, $child_ids);
                            break;
                        }
                    }     
                } else {
                    $idArray = array_merge($idArray, $this->modx->getChildIds($id, $depth));
                }                
            }

            $idArray = array_unique($idArray);

            sort($idArray);
        }

        $this->ids = $idArray;

        return $this->ids;
    }

Another possible solution is to query the contexts for the specified IDs (&ids) from the database and then using the correct context for every ID when calling getChildIds(). But this requires an additional database query.

juro commented 5 months ago

Thanks @halftrainedharry for your solution. I have briefly tested it and works as expected. This solution requires to specify all &contexts where specified &ids can be located, otherwise those parent &ids are not searched. This renders subsequent filter to &contexts obsolete. So user can:

Your second solution is probably more intuitive and would work as one expect from docs:

halftrainedharry commented 5 months ago

Here is some example code for the second solution:

protected function processIds(string $ids = '', string $type = 'parents', int $depth = 10): array
{
    if ($ids === '') {
        return [];
    }

    $ids = $this->cleanIds($ids);
    $idArray = explode(',', $ids);
    if ($type === 'parents') {

        // Read the 'contexts' property
        $ctx = !empty($this->config['contexts']) ? $this->config['contexts'] : $this->modx->context->get('key');
        $ctx = $this->cleanIds($ctx);
        $ctxArray = !empty($ctx) ? explode(',', $ctx) : [];

        // Query the contexts for all the IDs
        $c = $this->modx->newQuery(modResource::class);
        $c->select('id,context_key');
        $c->where(['id:IN' => $idArray]);
        $ctxLookup = [];
        if ($c->prepare() && $c->stmt->execute()) {
            while (($row = $c->stmt->fetch(\PDO::FETCH_ASSOC)) !== false) {
                $ctxLookup[$row['id']] = $row['context_key'];
                if (!in_array($row['context_key'], $ctxArray)){
                    $ctxArray[] = $row['context_key'];
                }
            }
        }

        // Set the 'contexts' property
        $this->config['contexts'] = implode(',', $ctxArray);

        foreach ($idArray as $id) {
            if (array_key_exists($id, $ctxLookup)){
                // Specify the correct context for this ID
                $idArray = array_merge($idArray, $this->modx->getChildIds($id, $depth, ['context' => $ctxLookup[$id]]));
            } else {
                $idArray = array_merge($idArray, $this->modx->getChildIds($id, $depth));
            }
        }

        $idArray = array_unique($idArray);

        sort($idArray);
    }

    $this->ids = $idArray;

    return $this->ids;
}

Because SimpleSearch always restricts the result to the current context when no &contexts property is specified, this property has to be changed in the processIds() function (which I don't like).

https://github.com/modxcms/SimpleSearch/blob/56d692ba987aeea27ca3709dc705f1a556db71ee/core/components/simplesearch/src/Driver/SimpleSearchDriverBasic.php#L230-L233

Also with this solution, every search that uses the &ids property is now slower, because there's an additional database query that's not really necessary in most cases.

juro commented 5 months ago

If I get it right, now if &ids are defined, it doesn't matter if &contexts are defined, because processIds() extends &contexts to include contexts of all &ids. Which is probably not intended behaviour. If both &ids and &contexts are defined, they should be treated as AND condition. It's probably enough not to change &contexts inside processIds().

But if only &ids are defined (and &idType=='parents'), context filtering can be probably skipped in SimpleSearchDriverBasic.php, something like this:

if( empty($this->config['ids']) || $this->config['idType']!='parents' || !empty($this->config['contexts']) ) {
 /* Restrict to either this context or specified contexts */ 
 $ctx = !empty($this->config['contexts']) ? $this->config['contexts'] : $this->modx->context->get('key'); 
 $f   = $this->modx->getSelectColumns(modResource::class, 'modResource','', array('context_key')); 
 $c->where(["$f:IN" => explode(',', $ctx)]); 
}

Other option would be to have some special value for &contexts with meaning "all". This gives you possibility to search accross entire site (with further option to refine search on &ids children).

Our use case for all this: we have several large, multi-language multi-product sites organized into separate contexts. But not entire context should be searched, only children of "main menu" resource and homepage in every context. I know we can limit search using searchable resource setting, but this requires editors to set this property all the time correctly.

Regarding speed: while site search is done only ocassionally, it's ok to have it slower in some complicated scenarios.

halftrainedharry commented 5 months ago

Regarding speed: while site search is done only ocassionally, it's ok to have it slower in some complicated scenarios.

Other users that use the &ids property, but only have 1 context, probably feel different about that. In general, the first solution seems to be preferable for maintaining backwards compatibility.


For you current use case, you might want to use a custom driver. There is some more information about how to create one in this recent forum thread. This lets you customize the search code, without worrying that it gets overwritten on the next update of the SimpleSearch extra.

juro commented 5 months ago

Thanks @halftrainedharry , for now we will stick with the first solution, enabling us to search inside specified parent ids and contexts. I hope this solution will make it into future release, now we use just a hot fix. If necessary, I will write our own custom driver. But whenever possible, it's better to use standardized and documented features.