acdh-oeaw / dhcr-main

Digital Humanities Course Registry Application
https://dhcr.clarin-dariah.eu/
Apache License 2.0

Search engine results do not link to the corresponding detail page #55

Open patrickakk opened 1 year ago

patrickakk commented 1 year ago

The search result for a specific course links to the main page instead of to the course detail page.

dietervu commented 1 year ago

For completeness, here is a screenshot:

image

Source: Google (English). Search URL: https://www.google.com/search?client=firefox-b-d&q=Digital+Humanities+Across+Borders+course

patrickakk commented 1 year ago

The problem above could possibly be solved by creating a sitemap.

As preparation: today I've removed the word "Sitemap" from both menus, in the public part and in the login area, since those menus are not a sitemap. The word has been replaced by the logo.

Does everybody agree with the new layout?

Public menu - before

image

Public menu - after

image

Login area menu - before

image

Login area menu - after

image

patrickakk commented 1 year ago

The change above is implemented in 2023-04. It is currently blocked until 2023-03 is released; then it can be reviewed.

patrickakk commented 1 year ago

@IvdL22 @PixlTracer Can you review this?

PixlTracer commented 1 year ago

I am ok with it!

patrickakk commented 1 year ago

@PixlTracer Ok, thanks for the reply. Then I'll move the issue back to the jul23 milestone for the original task.

@IvdL22 Please reply if you have other thoughts about the layout change.

patrickakk commented 1 year ago

Proposed solution: first create an XML sitemap and list it in robots.txt. If the search engine results don't improve, we can look at other options.
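To illustrate the robots.txt part of this proposal, here is a minimal sketch in Python that declares a sitemap in robots.txt and reads it back with the standard library. The robots.txt content is an assumption for illustration (the site's real file is not shown in this issue), and the sitemap URL assumes the production host serves the same `/sitemap.xml` path as dev.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file is not shown in this issue.
ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://dhcr.clarin-dariah.eu/sitemap.xml
"""


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the sitemap URLs declared in a robots.txt document."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present (Python 3.8+).
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemaps_from_robots(ROBOTS_TXT))
```

A check like this could be run after release to confirm that crawlers can discover the sitemap from robots.txt.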

patrickakk commented 1 year ago

@IvdL22 @PixlTracer

Can you review this?

I've created a tool that lists all publicly shown courses, together with their last-updated dates, in the sitemap. That is currently 192 URLs. Some static pages are listed as well.

Which static pages do we want listed in the sitemap, i.e. findable in a search engine? I've added the following URLs:

Do we need to add or remove pages from that list?

The XML sitemap can be found here: https://dev-dhcr.clarin-dariah.eu/sitemap.xml
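For readers unfamiliar with the format, here is a rough sketch in Python of how such a sitemap could be generated. It is not the tool from this issue, which is part of the application and not shown here. The `(course_id, last_updated)` input shape, the `/courses/view/<id>` URL pattern and the example data are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
BASE_URL = "https://dhcr.clarin-dariah.eu"  # production host from this issue


def build_sitemap(courses, static_paths):
    """Build a sitemap XML string.

    courses: iterable of (course_id, last_updated) tuples (hypothetical shape).
    static_paths: iterable of site-relative paths for static pages.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path in static_paths:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{BASE_URL}{path}"
    for course_id, last_updated in courses:
        url = ET.SubElement(urlset, "url")
        # The /courses/view/<id> pattern is an assumption for illustration.
        ET.SubElement(url, "loc").text = f"{BASE_URL}/courses/view/{course_id}"
        # lastmod uses the W3C date format (YYYY-MM-DD).
        ET.SubElement(url, "lastmod").text = last_updated.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Made-up example data, not real course records.
    example_courses = [(1, date(2023, 7, 1)), (2, date(2023, 6, 15))]
    print(build_sitemap(example_courses, ["/"]))
```

Static pages are emitted without `lastmod` here because no change date is known for them. The sitemap protocol treats that element as optional.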

Can you change the label to Done if you're satisfied?

Note for myself: Todo after review/release:
- Add to robots.txt
- Create daily cronjob

PixlTracer commented 1 year ago

Elegant solution, thanks for the suggestions! I think this list should be sufficient. I guess we could add more links later if we come across other URLs to include? I'll set the label to Done.

patrickakk commented 1 year ago

@PixlTracer Yes, links to static pages can be added in the future. (Please mind that the list is defined in the code, so changes have to wait until the next release.)

For this issue, I'll take on the rest now, todo:

patrickakk commented 1 year ago

@PixlTracer I've removed https://dhcr.clarin-dariah.eu/national-moderators from the sitemap, since search engines would otherwise also crawl and store the moderators' email addresses. Please let me know if you prefer otherwise.

patrickakk commented 1 year ago

Blocked until 2023-09-13 (6 weeks from now).

Then test listing in search engines.

patrickakk commented 1 year ago

Currently waiting for availability of Google Webmaster Console.

patrickakk commented 8 months ago

Update: Google Webmaster Console is available as of today.