Schweriner / tgm_copyright

Language Handling for Image sitemap? #29

Open neotrace opened 5 years ago

neotrace commented 5 years ago

Hi, I am not sure whether this is an issue or whether I am just missing something. I have a multi-tree, multi-domain TYPO3 installation for a client with 6 languages. We use realurl and have /en, /es, etc. for the languages. Each language has individual images named in its native language. We use tgm_copyright for just one of our 4 pagetrees.

When I look at the sitemap via DOMAIN/?type=1458065166 I see the languages mixed in the image sitemap. For example, under a German URL I see images from several of the other languages. If I check the source of the page, none of the other languages' images are there, just the correct ones for that language.

How is language handling supposed to work for this extension? I tried DOMAIN/en/?type=1458065166 and DOMAIN/?type=1458065166&L=1, but that gave totally weird results, with URLs from other domains / pagetrees appearing in the image sitemap.

Schweriner commented 5 years ago

@neotrace thanks for your ticket. Indeed, languages are currently not respected in the sitemap. I will try to add this feature asap, but I've got a lot to do these days.

neotrace commented 5 years ago

We would be interested in sponsoring such a feature if this would speed up things. If you could give an estimate and timeframe I could talk to the client.

Schweriner commented 5 years ago

@neotrace do you need 9 LTS support for the client too, or only the correct language in the sitemap?

neotrace commented 5 years ago

We are currently still on 8 LTS and plans are to switch in 2020 at end of life. So no, we do not currently need it!

Schweriner commented 5 years ago

@neotrace Fixed and released to TER. Please note that only real localized file references are used in the sitemap. If some content on the site is not translated and is configured to fall back to the default language, then its file references will not appear in the sitemap of other languages. Maybe I will handle this situation in a later release, and I hope it's enough for you at the moment. This fix was done in a few minutes, but if you still feel happy to sponsor this, feel free: https://www.paypal.me/TollPaul

neotrace commented 5 years ago

Sorry for the late feedback. Unfortunately that does not work for us since the client did not localize in the Filelist but used different images in the localized content elements.

At least the mix of languages is now gone from the sitemap for us, but I only see entries with German URLs now. The other 5 languages are missing completely. I guess this is because German is our default language and we do not have localized file references. We will have to find another solution, which is a shame since in principle the sitemap looks excellent, even with tags.

Let me know if I can help with testing if you do decide to go further at some point.

Schweriner commented 5 years ago

@neotrace the Filelist does not have to be translated. It's all about the references from other records. Please open the sitemap typeNum page with the related L parameter to see the images of other languages, like: ?type=1458065166&L=1
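As a rough illustration of the URL pattern above, here is a small Python sketch that builds one sitemap URL per language. Only the typeNum 1458065166 and the L parameter come from this thread; the helper name, the domain and the language uids are placeholders.

```python
from urllib.parse import urlencode

SITEMAP_TYPENUM = 1458065166  # typeNum quoted in this thread


def sitemap_url(base_url: str, language_uid: int) -> str:
    """Return the image sitemap URL for one language (hypothetical helper)."""
    query = urlencode({"type": SITEMAP_TYPENUM, "L": language_uid})
    return f"{base_url.rstrip('/')}/?{query}"


# Placeholder domain and language uids, not taken from a real installation.
urls = [sitemap_url("https://www.example.org", uid) for uid in range(3)]
for url in urls:
    print(url)
```

Each printed URL can then be opened in a browser to see which images are listed for that language.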

neotrace commented 5 years ago

I did also try the rootpage with index.php?id=151&type=1458065166&L=3, but then I get results from one of the other pagetrees/websites inside my sitemap. The language is handled correctly now; everything is Italian since that is L=3 on both websites. But images from website B should of course not be in the sitemap for website A. These images are also only referenced on website B, I checked. And the extension template is also only included in website A. No idea why these images show up. Can I somehow restrict the plugin to just rootpage 151 and below?

Schweriner commented 5 years ago

@neotrace please check out the TypoScript Constant Editor for the extension. There is a setting for this.

neotrace commented 5 years ago

Of course, forgot about that since it was the first thing I did. Sorry. The problem is the correct rootline is defined, but I still see locations from 2 other rootlines in the sitemap. Only a few though, not all pages of those rootlines. They stand out since they have different domains. Any idea where this could be coming from?
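To narrow down which entries come from the other rootlines, one option is to save the sitemap output and list every <loc> whose host differs from the expected domain. The Python sketch below assumes the sitemap uses the standard sitemaps.org namespace on a <urlset> root; the function name and the sample domains are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumption: the sitemap declares the standard sitemaps.org namespace.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Tiny stand-in for a saved sitemap; the domains are placeholders.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.site-a.example/en/</loc></url>
  <url><loc>https://www.site-b.example/en/news/</loc></url>
</urlset>"""


def foreign_locs(sitemap_xml: str, expected_host: str) -> list[str]:
    """Return all page URLs whose host is not the expected one."""
    root = ET.fromstring(sitemap_xml)
    result = []
    for loc in root.findall("sm:url/sm:loc", NS):
        url = (loc.text or "").strip()
        if urlparse(url).netloc != expected_host:
            result.append(url)
    return result


for url in foreign_locs(SAMPLE, "www.site-a.example"):
    print(url)
```

The resulting list gives the pages whose records are worth inspecting in the backend and in the database.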

Schweriner commented 5 years ago

@neotrace please check out some details of those records. Search for the sys_file_reference records in the database and check whether their PID is the same as that of their parent records. Maybe the records have a wrong PID which is located in the other rootline.

neotrace commented 5 years ago

I'm afraid I don't understand what to search for exactly. I have an example with two images from the other rootline appearing under a <loc> of that other rootline in my sitemap, but with the correct domain in their image URLs. Also note that the images here still mix two languages, although I request type=1458065166&L=1, which should be English only:

<url>
    <loc>https://www.wrongrootline_domain/en/news/news-detail-view/</loc>
    <image:image>
        <image:loc>
            https://www.correctrootline_domain/fileadmin/img/header/header-small-3.jpg
        </image:loc>
        <image:title>Current topics.</image:title>
        <image:caption>GROUP NEWS</image:caption>
    </image:image>
    <image:image>
        <image:loc>
            https://www.correctrootline_domain/fileadmin/img/header/header-small-3.jpg
        </image:loc>
        <image:title>Aktuelles.</image:title>
        <image:caption>Neuigkeiten rund um die Gruppe</image:caption>
    </image:image>
</url>

If I look at sys_file_reference, both have the same pid, which is the page they are listed under, and this page is indeed in the wrong rootline. But the <loc> is already in the wrong rootline.

I don't understand what you mean by the parent records. Which fields should I compare in which tables?

Schweriner commented 5 years ago

Now check out the news record behind this bad entry. Hover over the image and it will show you the sys_file_reference UID. Then look up this record in your database using phpMyAdmin or Adminer. Does this sys_file_reference record have the same PID as the news record?
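The PID comparison described above can also be done for all references at once with a join. The following Python sketch runs such a query against an in-memory SQLite database as a stand-in for the real TYPO3 database. The columns pid, uid_foreign, tablenames and fieldname exist on sys_file_reference; the parent table name tx_news_domain_model_news, the fieldname value and all row values are assumptions for illustration only.

```python
import sqlite3

# In-memory stand-in for the TYPO3 database; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tx_news_domain_model_news (uid INTEGER PRIMARY KEY, pid INTEGER);
CREATE TABLE sys_file_reference (
    uid INTEGER PRIMARY KEY,
    pid INTEGER,
    uid_foreign INTEGER,   -- uid of the parent record
    tablenames TEXT,       -- table of the parent record
    fieldname TEXT
);
INSERT INTO tx_news_domain_model_news VALUES (10, 200), (11, 200);
INSERT INTO sys_file_reference VALUES
    (501, 200, 10, 'tx_news_domain_model_news', 'fal_media'),
    (502, 999, 11, 'tx_news_domain_model_news', 'fal_media');
""")

# Find file references whose pid differs from the pid of their parent record.
MISMATCH_QUERY = """
SELECT ref.uid, ref.pid, parent.uid, parent.pid
FROM sys_file_reference AS ref
JOIN tx_news_domain_model_news AS parent ON parent.uid = ref.uid_foreign
WHERE ref.tablenames = 'tx_news_domain_model_news'
  AND ref.pid <> parent.pid
ORDER BY ref.uid
"""

mismatches = conn.execute(MISMATCH_QUERY).fetchall()
for ref_uid, ref_pid, parent_uid, parent_pid in mismatches:
    print(f"reference {ref_uid}: pid {ref_pid}, parent {parent_uid} has pid {parent_pid}")
```

The SELECT statement itself should work largely unchanged in phpMyAdmin or Adminer once the parent table name matches the installation.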