Closed tsufz closed 4 years ago
I was actually wondering why Google is delivering the old presentation and not the new one: https://www.google.com/search?client=firefox-b-d&q=diclofenac+massbank https://massbank.eu/MassBank/RecordDisplay.jsp?id=EA020111.
Now it is clear that the cause is the broken sitemap.
Hi @tsufz, I fixed this issue, created a new release, and merged the new release into the massbank.eu branch. Please roll it out whenever you have time. Make sure you adjust the settings in the Google Search Console after the rollout. Here is a short explanation:
There were two issues here. The first is the missing slashes. I specified the slash in the massbank.conf file, and apparently you gave the URL without a slash. In the future, the slashes will be checked and added if needed.
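The "check and add if needed" idea can be sketched as follows. This is only an illustration in Python, not the actual MassBank code (which is Java); the function name `join_base_url` and the example URLs are assumptions made for this sketch.

```python
def join_base_url(base_url: str, path: str) -> str:
    """Join a configured base URL and a relative path with exactly one slash.

    Hypothetical helper: it accepts the base URL with or without a
    trailing slash, so both spellings in the config give the same result.
    """
    return base_url.rstrip("/") + "/" + path.lstrip("/")


# Both config spellings produce the same sitemap location.
print(join_base_url("https://example.org/MassBank", "sitemap.xml"))
print(join_base_url("https://example.org/MassBank/", "/sitemap.xml"))
```

Normalizing at the point where URLs are built means the config value does not have to follow one particular spelling.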
The second is the naming of the sitemaps, which has now changed. You requested a "standard" name for the sitemap. It is now sitemap.xml, which is probably the most "standard" name. In our case it is a sitemap index file, which lists the actual sitemap files in the location .../MassBank/sitemap/{sitemap, sitemap1, sitemap2}.xml. So please check that it is working in your installation after deployment. Make sure you update your sitemap at Google so that their robots can find the new sitemap, and remove the old sitemap location.
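For illustration, a sitemap index in the sitemaps.org format could be generated roughly like this. This is a Python sketch under assumptions, not the MassBank implementation: the function name `build_sitemap_index` and the example.org URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document listing the given sitemap URLs."""
    # Serialize the sitemaps.org namespace as the default (unprefixed) one.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for url in sitemap_urls:
        sitemap = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        loc = ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc")
        loc.text = url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs; a real index would list the deployed sitemap files.
print(build_sitemap_index([
    "https://example.org/MassBank/sitemap/sitemap.xml",
    "https://example.org/MassBank/sitemap/sitemap1.xml",
]))
```

The root element is `sitemapindex` and each child is a `sitemap` with a `loc`, whatever the index file itself is called.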
Please make a complete rollout with database, because there were some changes in the database due to #242.
Please close if everything is fine.
Hi @meier-rene, thanks for the update! Is the name sitemap.xml really working? According to sitemaps.org, sitemapindex.xml is the official index for multiple individual sitemaps. https://www.sitemaps.org/protocol.html#sitemapIndex_sitemapindex
@meier-rene, the sitemaps are not yet working properly. I really don't understand why the index file is not named sitemapindex.xml and why the sitemaps themselves do not contain the schema tags. I think the generated sitemaps are failing. Please follow the conventions. Everything worked with the old sitemaps before the implementation in Java.
The sitemaps can be fetched:
Hi @tsufz, first, regarding the name: I can call the sitemap index whatever you want, but I really cannot find an official naming scheme. https://www.sitemaps.org/protocol.html#sitemapIndex_sitemapindex doesn't even mention a sitemapindex.xml. I don't think the problems have anything to do with naming, but I will change it to sitemapindex.xml and try again.
Besides that, Google should process the sitemaps, but it also fails in some way on our dev server.
I don't know what the problem is at the moment.
Hi!
I am a little puzzled; all schemas are on that page:
Best,
Tobias
@ermueller once implemented this scheme in MassBank, and we had an R script running it.
@meier-rene, the sitemaps look broken to me. There are missing slashes, and why are they not generated according to the general sitemap schema https://www.sitemaps.org/protocol.html as before? The entries could not be fetched: ![image](https://user-images.githubusercontent.com/2526819/81728337-925a2380-948a-11ea-9b14-fbe8b04e2754.png)
See: https://massbank.eu/MassBank/sitemap/sitemap1.xml
Furthermore, the common name for the sitemap index is sitemapindex.xml and not sitemap-index.xml (https://www.sitemaps.org/protocol.html#index)
I think we should be compliant with the schema to guarantee that all crawlers find the sitemaps. I don't know if they work with wildcards to find sitemaps. I only maintain Bing and Google.
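A quick way to test a generated sitemap for the schema tags before handing it to a crawler might look like the sketch below. It is a minimal check written in Python for illustration only, not a full schema validation; the function name `check_sitemap` and the sample documents are assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap(xml_text):
    """Return a list of problems found in a sitemap document (empty if none)."""
    try:
        root = ET.fromstring(xml_text.encode("utf-8"))
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    # The root must be <urlset> in the sitemaps.org namespace.
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append(f"unexpected root element {root.tag!r}")
    # Every <url> entry needs a non-empty <loc>.
    for i, url in enumerate(root.findall(f"{{{SITEMAP_NS}}}url"), start=1):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<url> entry {i} has no <loc>")
    return problems


# A sitemap without the namespace declaration is reported as a problem.
print(check_sitemap("<urlset><url><loc>https://example.org/x</loc></url></urlset>"))
```

A check like this only covers the root element and the `loc` entries; whether a given crawler accepts the file still has to be confirmed in its own tooling.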