Closed pspanja closed 1 year ago
@pspanja as discussed internally, let's include the translation_siteaccess logic into the router as well. Example:
Let's look at the following siteaccess configuration snippet:
All three siteaccesses point to the same root location ID, and we have the translation_siteaccess list defined where we cover all languages.
Let's say we are on the german website. If we render the URL to a content that only has english translation, and there is a translation siteaccess for an english language, the URL should point to the translation siteaccess instead of the current one. If no translation siteaccesses are defined, we stay in the context of the current siteaccess (default behavior)
OK, to do added 👍🏼
Implemented handling of translation siteaccesses and updated description, please re-review.
While reviewing do check the tests cases, they were written to mimic the siteaccess config and describe the behaviour in much more detail than can be documented here in short.
Test failures are caused by Travis updating composer to 2.0, this seems to be something that should be fixed in eZ kernel.
Also, cross-siteaccess routing is now enabled by default.
Do not review yet, this needs an update for choosing the siteaccess by the best matching translation, following the discussion with @vjeran and @ivrdoljak.
@hknezevic (not related to the above):
Let's say we are on the german website. If we render the URL to a content that only has english translation, and there is a translation siteaccess for an english language, the URL should point to the translation siteaccess instead of the current one. If no translation siteaccesses are defined, we stay in the context of the current siteaccess (default behavior)
In the example you gave, this would change the basic language fallback behaviour; even if the Location URL could be generated on ger
, we would generate it on eng
. While it can be controlled by translation_siteaccesses
config, I'm not sure it will not interfere with its use for the language switcher. Do we really want such switching between siteaccess in all cases?
Here are some use cases for which I would like your feedback.
Question: In the siteaccess three
, for the Location in subtree 100
and the language ita-IT
, for which siteaccess URL should be generated for and why?
one:
content:
tree_root:
location_id: 100
languages:
- fre-FR
- ita-IT
- cro-HR
two:
content:
tree_root:
location_id: 100
languages:
- cro-HR
- fre-FR
- ita-IT
three:
content:
tree_root:
location_id: 200
languages:
- eng-GB
- ger-DE
one:
content:
tree_root:
location_id: 100
languages:
- ita-IT
- eng-GB
two:
content:
tree_root:
location_id: 100
languages:
- eng-GB
- ita-IT
three:
content:
tree_root:
location_id: 200
languages:
- eng-GB
- ger-DE
one:
content:
tree_root:
location_id: 100
languages:
- nor-NO
- fre-FR
- ger-DE
two:
content:
tree_root:
location_id: 100
languages:
- fre-FR
- nor-NO
- eng-GB
three:
content:
tree_root:
location_id: 200
languages:
- eng-GB
- ger-DE
Do not review yet, this needs an update for choosing the siteaccess by the best matching translation, following the discussion with @vjeran and @ivrdoljak.
@hknezevic (not related to the above):
Let's say we are on the german website. If we render the URL to a content that only has english translation, and there is a translation siteaccess for an english language, the URL should point to the translation siteaccess instead of the current one. If no translation siteaccesses are defined, we stay in the context of the current siteaccess (default behavior)
In the example you gave, this would change the basic language fallback behaviour; even if the Location URL could be generated on
ger
, we would generate it oneng
. While it can be controlled bytranslation_siteaccesses
config, I'm not sure it will not interfere with its use for the language switcher. Do we really want such switching between siteaccess in all cases?
I don't see why it would interfere with anything.
This was the showcase of the situation when we look at the content in the context of current subtree and current siteaccess. Translation siteaccess list has a sole purpose: if the content is multi-language, to provide means of generating URLs for different translations in different siteaccess context. If it is not defined for the current matching siteaccess or siteaccess group, then the prioritized language list is applied and we will stay in the context of the current siteaccess.
Same goes if we have a content translation that matches none of the primary languages in any of the translation siteaccesses; in this case the URL is generated in the current siteaccess context, given that the language is available in the prioritized language list of the current siteaccess.
This was the showcase of the situation when we look at the content in the context of current subtree and current siteaccess. Translation siteaccess list has a sole purpose: if the content is multi-language, to provide means of generating URLs for different translations in different siteaccess context. If it is not defined for the current matching siteaccess or siteaccess group, then the prioritized language list is applied and we will stay in the context of the current siteaccess.
My concern is the following: URL for Location with eng-GB
on ger
siteaccess will be generated for eng
siteaccess even if eng-GB
is allowed on ger
. So the siteaccess will be switched, and from my POV user wants to view this Location on ger
in eng-GB
as an exception, and stay on ger
for the rest of the content as that's his primary language.
But completely OK with me if that is desired and we can control it as needed.
This was the showcase of the situation when we look at the content in the context of current subtree and current siteaccess. Translation siteaccess list has a sole purpose: if the content is multi-language, to provide means of generating URLs for different translations in different siteaccess context. If it is not defined for the current matching siteaccess or siteaccess group, then the prioritized language list is applied and we will stay in the context of the current siteaccess.
My concern is the following: URL for Location with
eng-GB
onger
siteaccess will be generated foreng
siteaccess even ifeng-GB
is allowed onger
. So the siteaccess will be switched, and from my POV user want to view this Location onger
ineng-GB
as an exception, and stay onger
for the rest of the content as that's his primary language.But completely OK with me if that is desired and we can control it as needed.
I agree, but the moment you define the translation siteaccess with the matching language, you already make decision that some particular language has its own siteaccess context. The fallback language can be used the purposes other than just translating the content, for example, when you want to be able to load some media assets where the language doesn't matter (like images), and in cases where for some reason you don't want to apply the "always available" flag on the content.
The situation where we have two siteaccesses pointing to the exact same subtree and with the same language lists (with only the order of the languages inverted) is highly unlikely to happen, unless they are connected as a translation siteaccesses. And even then, we always pick one language and one siteaccess as a general fallback if nothing matches, and all the other siteaccesses have their own prioritized language.
I am also looking at this from the SEO perspective: we usually utilise the translation siteaccess list for another purpose, which is to generate meta alternate/hreflang URLs, describing that the same content exists in different languages on the defined URLs for the search robots. You either show the multi-language content in the same siteaccess, following the translation fallback, and do not have the distinct language variant URLs, or you have translation siteacceses defined and provide distinct different language variant URLs.
OK, clear 👍
@pspanja comment on the use cases you provided above:
For the sake of simplicity, I would go with the first siteaccess that matches the content subtree and which the non-system URL reference can be generated for.
For example, have a look what at this code snippet, where we generate a list of available content translations for the language switcher:
If you can generate the non-empty route reference in the context of another siteaccess, and if it is possible to generate the URL for this route which is not a generic system URL (%/view/content/%), then use this siteaccess for generating the URL.
All of the listed examples are edge cases which complicate the matter (for example, we do not know if the content has "always available" flag set or not). But, if we wanted to fine-tune the system to cover all possible cases and scenarios, we could solve this through further configuration options - selectively enabling/disabling cross-linking for particular siteaccess or siteaccess group, or restricting the system in a way that it only allows one-to-one siteaccess for the cross-links.
For example, we always cross-link to a single siteaccess (first matched or manually defined through the configuration) with subtree location 100, if we are generating link outside of that particular subtree. Then, we can go even deeper, and if the matched siteaccess by subtree has a translation siteaccesses list, we can try to find the correct language siteaccess using the logic that you already implemented for this PR :)
Oh, and case 3 makes no sense at all :). None of the siteaccesses have ita-IT as available language, so it wouldn't really matter which one we pick - either the content is always available so it will be available in any of them, or it is not marked as always available, in which case we would not even be able to load the location/content in the context of the siteaccess "three", correct?
The case are a bit on the edge site, but they describe some important behaviour, for example in case 1 we need to decide how to select best matching siteaccess with regard to the Location's translations (is the language list important aside from the most prioritized language?), and in case 2 we need to decide whether to link to a siteaccess one
that best matches the available Location translation, or to a siteaccess two
where this language can be shown and it's not the most prioritized language, but the siteaccess has the same most prioritized language as the source siteaccess. In case 3 we can select the best matching siteaccess from the source siteaccess prioritized languages (again not only on the most prioritized one).
Always available also needs to be handled, but that's the fallback case as it can be generated for any siteaccess matching the subtree. I mostly already solved the mechanics, just want to confirm that my reasoning about this is intuitive.
For example, have a look what at this code snippet, where we generate a list of available content translations for the language switcher:
Thanks for this, I never looked much into those parts.
This is again ready for the review, please check the test cases if the code is too dense to read: https://github.com/netgen/ezplatform-site-api/blob/ecad431e9cfbb9ee3c3c951a3d9c128267894d2a/tests/bundle/Routing/SiteaccessResolverTest.php#L27
(Description updated)
Added documentation.
This feature will be implemented in version 5.1 (https://github.com/netgen/ibexa-site-api).
This implements per-siteaccess configurable cross-siteaccess
UrlAliasRouter
.The router can be configured whether cross-siteaccess routing will be used with
ng_cross_siteaccess_routing
(showing defaults):If cross-siteaccess routing is enabled, as a first step a siteaccess will be chosen from all siteaccesses that have the given
Location
under it's tree root, configured withcontent.tree_root.location_id
option. If the current siteaccess is among those, it will be used for matching, otherwise the best matching siteaccess will be resolved from the current siteaccess' prioritized languages and available siteaccesses prioritized languages. If none matches, the siteaccesses will be checked if they allow the given Location in regard to the available languages and always available flag. They will be checked as they are listed in the configuration, meaning their order is significant.If
ng_cross_siteaccess_routing_prefer_translation_siteaccess
options is set totrue
, for the matched siteaccess the end siteaccess will be matched from thetranslation_siteaccesses
configuration. Again, current siteaccess prioritized languages will be used to determine best matching translation siteaccess, otherwise they will be checked in the order they are configured. Note that translation siteaccess will be checked only for it's most prioritized language.In case none of the above matches, current siteaccess is used as a fallback.
External (for the current siteaccess tree root) subtreess, configured for the matching side with "exclude path prefix" option, are here on the generating side handled by
ng_cross_siteaccess_routing_external_subtree_roots
option. Location that is found to be under the configured external subtree root will be always handled in the context of the current siteaccess.If the end result of resolving a siteaccess is a non-current siteaccess, the router will generate an absolute path, including the host. That means such path will be returned from both
url
andpath
Twig functions. However, the resulting URL will be checked if it matches the current request host and is not generated as absolute. In that case, the host part will be removed, and the absolute URL will be generated only when that's absolutely needed.Since
UrlAliasRouter
now has to handleContent
andLocation
objects as well, previously separateContentUrlAliasRouter
andLocationUrlAliasRouter
have been merged into it.Note
The
SiteaccessResolver
service is aggresively cached in-memory, since it must be performant when handling hundreds and even thousands of links. For that reason the code there is not very pretty and might be a bit dense to read.To do