processwire / processwire-issues

ProcessWire issue reports.
44 stars 2 forks source link

findJoin fails for page reference fields #1951

Open adrianbj opened 1 month ago

adrianbj commented 1 month ago

Short description of the issue

If you try to include a page reference field in a findJoin() (or with the find() and the field=page_reference_field approach you get this error: "SQLSTATE[HY093]: Invalid parameter number: number of bound variables does not match number of tokens on line: 168 in /Users/adrianjones/Sites/griefcoach.test/wire/core/WireDatabasePDOStatement.php"

Expected behavior

No error - should work as it does if the page reference field is permanently autojoined via the field settings checkbox.

Actual behavior

Fails with that error. I thought perhaps I might need something like page_reference_field.id but this doesn't work either. In the case of single PR fields, it prevents the error, but the field still isn't joined, but in the case of multi-value PR fields, the error persists.

adrianbj commented 1 month ago

@ryancramerdesign - I also wonder if we could have the join options made available for findMany() please?

adrianbj commented 1 month ago

Sorry, I am getting sidetracked here, but I am starting to wonder about the benefits of autojoining PR fields.

It seems like initially all that is loaded is an array of page IDs references by the user_type field, but as soon as you output the user_type itself, like in the example below), then it contains the full details of the referenced pages. Surely this requires an additional DB query anyway? Perhaps there are still speed advantages depending on what you're doing with the PR field, but it seems weird to me that it doesn't load the full page details of the referenced pages initially. Maybe it's simply a trade-off to ensure a PR field being autojoined won't dramatically increase the size of the main page object?

image

Here some experiments I think that are interesting. Note the execution time for each of these and that the user_type field is hard autojoined via the field flag. Note that most are find(), but a couple are findMany()

It's definitely worth thinking through what's happening in each of these, but the one comparison that really stands out to me is the difference between #1 and #4 which seems to show that the autojoin really isn't as useful as I would expect if you actually need to do anything with the PR field. It's still much quicker than #3, but if the has() didn't result in an extra DB call (because it can't use the array of referenced IDs), then I would expect it would be similar in time to #1.

image image image image image image
adrianbj commented 1 month ago

Also seeing an error with text fields when checking value of the field in selector.

"SQLSTATE[42S22]: Column not found: 1054 Unknown column 'field_limited_to.data' in 'field list' on line: 168 in /Users/adrianjones/Sites/griefcoach.test/wire/core/WireDatabasePDOStatement.php"

It works fine if the autojoin checkbox is checked - it only fails with trying to join within the find() query. And it only happens if I do something like these where I am filtering to the field being empty or not empty.

$pages->find('template=date, limited_to=, field=limited_to') or $pages->find('template=date, limited_to!=, field=limited_to')

If I just do $pages->find("template=date, field=limited_to") then it works as expected.

adrianbj commented 1 month ago

@ryancramerdesign - I think some of what I am seeing here is related to these things I posted about back in 2022: https://processwire.com/talk/topic/25088-pw-30172-%E2%80%93-find-faster-and-more-efficiently/?do=findComment&comment=228604

It's such a great feature being able to join on a query by query basis, but at the moment there are just too many problems still.

ryancramerdesign commented 1 month ago

@adrianbj I can't duplicate the issue here. Here's what I'm testing with, maybe I'm missing some part? The field "brands" is a Page Reference field. The results have a mixture of pages with 0 to 2 pages populated for the brands field.

$items = $pages->find("template=operator, field=brands"); 
foreach($items as $item) {
  echo "$item->title: " . $item->brands->implode(', ', 'title') . "\n";
}

Also tried this:

$items = $pages->findJoin("template=operator", [ "brands" ]); 
adrianbj commented 2 days ago

Hi @ryancramerdesign - sorry you can't reproduce that error - neither can I right now so it must be related to some exact selector situation, not just page reference fields in general. However, I am still seeing lots of other issues with this findJoin approach.

The first is the text field when checking if empty or not empty as noted above: https://github.com/processwire/processwire-issues/issues/1951#issuecomment-2240502838 - this should be very easy to reproduce (I just checked again).

The other current problem I am seeing is where adding a page reference field to the list of fields to autojoin will prevent another also specified field from being loaded, eg:

field=message_rating|message_priority

returns:

image

but:

field=message_rating|relationships_to_central_griever|message_priority

returns:

image

There are two problems with this - the message_rating field is no longer joined, but also look at the now 342 entries for message_priority.

In both examples, the message_priority field result makes no sense - it is a Select Options field, but why are their so many results - it only has one value in the DB so it looks like the wrong type of JOIN is being used. Note that if I join via the field's settings, then only one value is returned, so there is definitely something funky with the runtime findJoin().

Please let me know if you need any more details cause it would be really great to get these all sorted.

Thanks!