MassBank / MassBank-web

The web server application and directly connected components for a MassBank web server

sitemaps broken #244

Closed tsufz closed 4 years ago

tsufz commented 4 years ago

@meier-rene, the sitemaps look broken to me. Some slashes are missing, and why are they no longer generated according to the general sitemap schema (https://www.sitemaps.org/protocol.html) as before? The entries could not be fetched: [image]

See: https://massbank.eu/MassBank/sitemap/sitemap1.xml

Furthermore, the common name for the sitemap index is sitemapindex.xml and not sitemap-index.xml (https://www.sitemaps.org/protocol.html#index)

I think we should be compliant with the schema to guarantee that all crawlers find the sitemaps. I don't know if crawlers use wildcards to find sitemaps. I only maintain Bing and Google.
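For illustration, here is a minimal Python sketch of what a schema-compliant sitemap and sitemap index could look like, following the element names and namespace from sitemaps.org. It is not the MassBank-web implementation (which is written in Java). The helper names `join_url`, `build_sitemap` and `build_sitemap_index`, the `example.org` base URL and the record path are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n'


def join_url(base, path):
    """Join base and path with exactly one slash between them (hypothetical helper)."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def _build(root_name, child_name, base_url, paths):
    # The xmlns attribute places the root and all children in the sitemap namespace.
    root = ET.Element(root_name, xmlns=SITEMAP_NS)
    for path in paths:
        child = ET.SubElement(root, child_name)
        ET.SubElement(child, "loc").text = join_url(base_url, path)
    return XML_DECLARATION + ET.tostring(root, encoding="unicode")


def build_sitemap(base_url, record_paths):
    """Return a <urlset> document with one <url><loc> entry per record path."""
    return _build("urlset", "url", base_url, record_paths)


def build_sitemap_index(base_url, sitemap_paths):
    """Return a <sitemapindex> document with one <sitemap><loc> entry per sitemap file."""
    return _build("sitemapindex", "sitemap", base_url, sitemap_paths)


if __name__ == "__main__":
    base = "https://example.org/MassBank/"  # assumed base URL, not the real server
    print(build_sitemap(base, ["/RecordDisplay?id=EXAMPLE1"]))
    print(build_sitemap_index(base, ["sitemap/sitemap1.xml"]))
```

Routing every URL through `join_url` is one way to avoid missing or doubled slashes regardless of how the base URL is configured.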

tsufz commented 4 years ago

I was actually wondering why Google is delivering the old presentation and not the new one: https://www.google.com/search?client=firefox-b-d&q=diclofenac+massbank https://massbank.eu/MassBank/RecordDisplay.jsp?id=EA020111.

Now it is clear that the cause is the broken sitemap.

meier-rene commented 4 years ago

Hi @tsufz, I fixed this issue, created a new release and merged the new release into the massbank.eu branch. Please roll out whenever you have time. Make sure you adjust the settings in the Google Search Console after the rollout. Here comes a short explanation:

Please make a complete rollout with database, because there were some changes in the database due to #242.

Please close if everything is fine.

tsufz commented 4 years ago

Hi @meier-rene, thanks for the update! Is the name sitemap.xml really working? According to sitemaps.org, sitemapindex.xml is the official index for many single sitemaps. https://www.sitemaps.org/protocol.html#sitemapIndex_sitemapindex

tsufz commented 4 years ago

@meier-rene, the sitemaps are not yet working properly. I really don't understand why the index file is not named sitemapindex.xml and why the sitemaps themselves do not contain the schema tags. I think the generated sitemaps are failing. Please follow the conventions. Everything worked with the old sitemaps before the implementation in Java. [image] [image]

The sitemaps can be fetched: [image]
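As a rough way to check the point about schema tags, the following Python sketch inspects a fetched sitemap for the root element, namespace and `<loc>` entries described on sitemaps.org. It is only an assumption about which checks are useful, not a full schema validation and not what Google or Bing actually run. The function name `check_sitemap` is made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Allowed root elements and the child element each one must contain.
EXPECTED_ROOTS = {
    f"{{{SITEMAP_NS}}}urlset": "url",
    f"{{{SITEMAP_NS}}}sitemapindex": "sitemap",
}


def check_sitemap(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_text.encode("utf-8"))
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    child_name = EXPECTED_ROOTS.get(root.tag)
    if child_name is None:
        # Covers both a wrong root element and a missing or wrong xmlns.
        return [f"unexpected root element or namespace: {root.tag}"]

    problems = []
    locs = root.findall(f"{{{SITEMAP_NS}}}{child_name}/{{{SITEMAP_NS}}}loc")
    if not locs:
        problems.append("no <loc> entries found")
    for loc in locs:
        text = (loc.text or "").strip()
        if not text.startswith(("http://", "https://")):
            problems.append(f"<loc> is not an absolute URL: {text!r}")
    return problems
```

A file can be fetched successfully and still fail such checks, for example when the root element lacks the `xmlns` declaration.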

meier-rene commented 4 years ago

Hi @tsufz, first regarding the name: I can call the sitemap index whatever you want, but I really cannot find an official naming scheme. https://www.sitemaps.org/protocol.html#sitemapIndex_sitemapindex doesn't even mention a sitemapindex.xml. I don't think that the problems have anything to do with naming, but I will change it to sitemapindex.xml and try again.

Besides that, Google should process the sitemaps, but it also fails in some way on our dev server. [image]

I don't know what the problem is at the moment.

tsufz commented 4 years ago

Hi! I am a bit puzzled; all the schemas are on that page: [image] [image] Best, Tobias

tsufz commented 4 years ago

@ermueller once implemented this scheme in MassBank, and we had an R script running it.