TYPO3-Solr / ext-solr

A TYPO3 extension that integrates the Apache Solr search server with TYPO3 CMS. dkd Internet Service GmbH is developing the extension. Community contributions are welcome. See CONTRIBUTING.md for details.
GNU General Public License v3.0

Do not differentiate sites by "domain" but use TYPO3 sites for this #3459

Open baschny opened 1 year ago

baschny commented 1 year ago

Currently EXT:solr treats a "site" as "everything on the same domain". This is a legacy of using sys_domain records. Since v9, TYPO3 has a different concept of "site": a set of pages sharing a common "root PID". The domain is just part of the "Base URL Prefix" used when generating URLs, so for example these could be different sites:

The Solr extension would consider main and site-2 to be the same site, because it only looks at the domain to calculate the siteHash, which is the major factor in deciding which results to display to the user in the frontend (thus it will mix search results for these two sites).
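
To illustrate the collision, here is a minimal sketch. It assumes the site hash is derived only from the host plus a secret; the function name siteHashForDomain, the secret, and the example.com base URLs are hypothetical stand-ins, not the extension's actual code or the sites from this report.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a domain-based site hash:
// only the host and a secret are hashed, never the site identifier.
function siteHashForDomain(string $domain, string $secret): string
{
    return sha1($domain . $secret . 'tx_solr');
}

// Two assumed TYPO3 sites sharing one host, differing only in the path prefix.
$sites = [
    'main'   => 'https://www.example.com/',
    'site-2' => 'https://www.example.com/site-2/',
];

$hashes = [];
foreach ($sites as $identifier => $base) {
    // Only the host survives; the path prefix is discarded.
    $host = parse_url($base, PHP_URL_HOST);
    $hashes[$identifier] = siteHashForDomain($host, 'some-secret');
}

// Both sites get the same hash, so a filter on the hash cannot separate them.
var_dump($hashes['main'] === $hashes['site-2']); // bool(true)
```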

So this is basically a bug report, but also a feature request: if we change this concept, some things (or sites relying on this "misbehaviour") might break. In the long run, though, it would be helpful, because we would know what we are dealing with when talking about a "Site".

Maybe I have just overlooked something; if so, please correct me!

avogt1701 commented 1 year ago

That would also fix some use cases with baseVariants. See #2846 #2578

baschny commented 1 year ago

Reading through these other issues:

Maybe a generic solution that copes with all these "wishes" would be to make it configurable which information is used to generate the "site hash":

You only have to decide which is the "default" one and document what it means when this is changed.

dkd-kaehm commented 11 months ago

@baschny @avogt1701 @christophlehmann Would a new PSR-14 event perhaps be a better choice than programming all the strategies/variants with settings?

avogt1701 commented 11 months ago

Yes, I think that could be a suitable solution. Maybe something like this:


SiteHashService: introduce a new method SiteHashService->getSiteHashDomain with a new PSR-14 event. Something like this:

public function getSiteHashDomain(\TYPO3\CMS\Core\Site\Entity\Site $typo3Site): string
{
    // todo: Add an event here that can be used to manipulate the "domain" resolution

    // current default
    return $typo3Site->getBase()->getHost();
}

Replace the "domain" resolution with the new method

SiteRepository->buildTypo3ManagedSite https://github.com/TYPO3-Solr/ext-solr/blob/12.0.0/Classes/Domain/Site/SiteRepository.php#L227

$siteHashService = GeneralUtility::makeInstance(SiteHashService::class);
$domain = $siteHashService->getSiteHashDomain($typo3Site);

SiteHashService->getDomainByPageIdAndReplaceMarkers https://github.com/TYPO3-Solr/ext-solr/blob/12.0.0/Classes/Domain/Site/SiteHashService.php#L106

$domainOfPage = $this->getSiteHashDomain($typo3Site);

SiteHashService->getDomainListOfAllSites https://github.com/TYPO3-Solr/ext-solr/blob/12.0.0/Classes/Domain/Site/SiteHashService.php#L91

$domains[] = $this->getSiteHashDomain($typo3Site);
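
A rough, framework-free sketch of how such an event could look. Everything here is an assumption for illustration: the event class AfterSiteHashDomainResolvedEvent, the SimpleDispatcher stand-in (a real implementation would use TYPO3's PSR-14 dispatcher), and the free function taking plain strings instead of a Site object.

```php
<?php
declare(strict_types=1);

// Hypothetical event carrying the resolved "domain"; listeners may replace it.
final class AfterSiteHashDomainResolvedEvent
{
    public function __construct(
        private readonly string $siteIdentifier,
        private string $domain,
    ) {}

    public function getSiteIdentifier(): string
    {
        return $this->siteIdentifier;
    }

    public function getDomain(): string
    {
        return $this->domain;
    }

    public function setDomain(string $domain): void
    {
        $this->domain = $domain;
    }
}

// Minimal stand-in for a PSR-14 dispatcher, only so the sketch runs on its own.
final class SimpleDispatcher
{
    /** @var list<callable> */
    private array $listeners = [];

    public function addListener(callable $listener): void
    {
        $this->listeners[] = $listener;
    }

    public function dispatch(object $event): object
    {
        foreach ($this->listeners as $listener) {
            $listener($event);
        }
        return $event;
    }
}

// Default stays the host; listeners can override the value used for hashing.
function getSiteHashDomain(string $siteIdentifier, string $host, SimpleDispatcher $dispatcher): string
{
    $event = new AfterSiteHashDomainResolvedEvent($siteIdentifier, $host);
    $dispatcher->dispatch($event);
    return $event->getDomain();
}

$dispatcher = new SimpleDispatcher();
// Example listener: make the value unique per site by appending the identifier.
$dispatcher->addListener(static function (AfterSiteHashDomainResolvedEvent $event): void {
    $event->setDomain($event->getDomain() . '#' . $event->getSiteIdentifier());
});

echo getSiteHashDomain('site-2', 'www.example.com', $dispatcher), PHP_EOL; // www.example.com#site-2
```

Without any listener the function returns the plain host, so existing installations would keep their current hashes.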

christophlehmann commented 11 months ago

We also need to take care of the last remaining eID script. It uses the domain, and the eID middleware runs before the site resolver middleware, so no site is available at that point. The script therefore needs to be turned into a middleware that runs after the site resolver.

A core-friendly site hash strategy as the upcoming default would be nice. From my point of view it could be Site::$identifier + TYPO3_CONTEXT.
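
A minimal sketch of that idea, assuming the hash is simply built from the site identifier and the application context plus a secret. The function name, the separator, and the secret are hypothetical and not taken from the extension.

```php
<?php
declare(strict_types=1);

// Hypothetical strategy: hash the site identifier and application context
// instead of the host, so sites on the same domain get distinct hashes.
function siteHashFromIdentifier(string $siteIdentifier, string $applicationContext, string $secret): string
{
    return sha1($siteIdentifier . '|' . $applicationContext . $secret . 'tx_solr');
}

$mainHash     = siteHashFromIdentifier('main', 'Production', 'some-secret');
$site2Hash    = siteHashFromIdentifier('site-2', 'Production', 'some-secret');
$mainDevHash  = siteHashFromIdentifier('main', 'Development', 'some-secret');

// Different sites differ, and the same site differs between contexts.
var_dump($mainHash !== $site2Hash);   // bool(true)
var_dump($mainHash !== $mainDevHash); // bool(true)
```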

sorenmalling commented 7 months ago

@baschny What has been your solution up until today?

sorenmalling commented 7 months ago

My take on a solution with what we already have present: https://gist.github.com/sorenmalling/15f2e4ba7f9c9bff19592da3f060443c

baschny commented 7 months ago

@sorenmalling one potential solution is to index the site identifier too:

plugin.tx_solr.index.queue.pages.fields {
...
    siteIdentifier_stringS = TEXT
    siteIdentifier_stringS.data = site:identifier
}

And use it in the filter:

plugin.tx_solr {
    search {
        query {
            filter {
                # restrict search results to the current site
                currentSite = TEXT
                currentSite.data = site:identifier
                currentSite.wrap = siteIdentifier_stringS:"|"
            }
        }