openlibhums / janeway

A web-based platform for publishing journals, preprints, conference proceedings, and books
https://janeway.systems/
GNU Affero General Public License v3.0

[Preprints] Site Map #1794

Closed ajrbyers closed 4 years ago

ajrbyers commented 4 years ago

The indexing system would expect the sitemap to be linked from the robots.txt instructions page for the site itself (e.g. https://dev.eartharxiv.org/robots.txt).
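As a rough illustration of the robots.txt link described above, the sketch below uses Python's standard `urllib.robotparser` (its `site_maps()` method needs Python 3.8 or later) to list the sitemaps a robots.txt file declares. The robots.txt content and the `sitemap.xml` URL are assumptions made up for the example, not what the dev site actually serves.

```python
from typing import List
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt: str) -> List[str]:
    """Return the Sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# Hypothetical robots.txt content; the sitemap path is an assumption.
EXAMPLE_ROBOTS = """\
User-agent: *
Allow: /
Sitemap: https://dev.eartharxiv.org/sitemap.xml
"""

print(declared_sitemaps(EXAMPLE_ROBOTS))
```

An empty result would suggest that an indexer reading robots.txt has no pointer to the sitemap.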

Here are sitemap specs:

ajrbyers commented 4 years ago

@mwestin-googlescholar

tingletech commented 4 years ago

A start on the sitemap: https://github.com/BirkbeckCTP/janeway/pull/1815

tingletech commented 4 years ago

Some notes about the dev site:

For example, in the HTML source for https://pub-janeway2-dev.escholarship.org/EA/repository/view/8/ these metatags would look like:

<meta name="citation_title" content="Taking the pulse of salt-detached gravity gliding in the eastern Mediterranean">
<meta name="citation_author" content="Sian Evans">
<meta name="citation_author" content="Christopher Aiden-Lee Jackson">
<meta name="citation_author" content="Davide Oppo">
<meta name="citation_abstract" lang="en" content="We investigate early-stage salt-detached gliding using a 3D seismic dataset from the Levant Margin in the Eastern Mediterranean, where gravitational instability due to margin uplift has caused north-westward translation of the Messinian salt sheet and its Plio-Pleistocene clastic overburden. Large, NE-trending, base-salt anticlines have allowed the basinward translation to be recorded by the development of supra-salt ramp syncline basins and fluid escape pipes, the latter forming due to the leakage of gas and fluid from the anticline crests.">

More on metatags here: https://scholar.google.com/intl/en/scholar/inclusion.html#indexing
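For illustration only, here is a minimal sketch of how tags like the ones above could be rendered from a preprint record. The function name `citation_meta_tags` and its parameters are hypothetical and are not part of Janeway. It uses the standard library's `html.escape` so that quotes or ampersands in a title cannot break the attribute.

```python
from html import escape
from typing import List, Sequence


def citation_meta_tags(title: str, authors: Sequence[str], abstract: str = "") -> List[str]:
    """Render citation_* meta tags; authors are emitted in the order given."""
    tags = [f'<meta name="citation_title" content="{escape(title)}">']
    for author in authors:
        tags.append(f'<meta name="citation_author" content="{escape(author)}">')
    if abstract:
        # lang="en" mirrors the example tag in the thread.
        tags.append(f'<meta name="citation_abstract" lang="en" content="{escape(abstract)}">')
    return tags


for tag in citation_meta_tags(
    "Taking the pulse of salt-detached gravity gliding in the eastern Mediterranean",
    ["Sian Evans", "Christopher Aiden-Lee Jackson", "Davide Oppo"],
):
    print(tag)
```

Passing the authors as an ordered sequence keeps the author order of the publication intact.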

ajrbyers commented 4 years ago

I think I made the citation tag changes already.

On Tue, 22 Sep 2020, 20:16 Brian Tingle, notifications@github.com wrote:

Some notes about the dev site:


Will a sitemap with article-level URLs be put in place for the live site? Because of the list-based browse structure (with many navigation pages), a sitemap would be necessary to ensure comprehensive indexing. Please let me know if it would be helpful for me to send specs for a sitemap that would work well for Google Scholar indexing.
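A sitemap with article-level URLs could look roughly like the output of the sketch below, which builds a sitemaps.org-style `urlset` with the standard library. This is not the implementation from the linked pull request. The helper name `build_sitemap` is hypothetical, and the second URL in the usage example is invented.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(article_urls: Iterable[str]) -> str:
    """Return sitemap XML with one <url><loc> entry per article landing page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in article_urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([
    "https://pub-janeway2-dev.escholarship.org/EA/repository/view/8/",
    "https://pub-janeway2-dev.escholarship.org/EA/repository/view/9/",  # hypothetical
]))
```

Listing every article page directly means a crawler does not have to walk the paginated browse lists to find them.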

I see Dublin Core tags but not the citation_xx metatags that Google Scholar uses for indexing. Would it be possible to add these?

For example, in the HTML source for https://pub-janeway2-dev.escholarship.org/EA/repository/view/8/ these metatags would look like:

  • These could be mapped from DC.Title tags, though I noticed that these are sometimes empty on the dev site.
  • These metatags would be listed in author order, as given in the publication. These could be mapped from the DC.Creator tags you have now.
  • These could be mapped from DC.Date.created tags.
  • Question: On the live site, would the PDF be hosted on the same subdomain as the article landing pages?
  • These could be mapped from the DC.Description tags you have now.
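The Dublin Core mapping suggested in these notes can be sketched as a simple lookup. The tag names on the citation side were stripped from the quoted message, so the pairing below is an assumption: `citation_title`, `citation_author` and `citation_abstract` come from the example earlier in the thread, while `citation_publication_date` is taken from Google Scholar's inclusion guidelines. The function name `map_dc_tags` is hypothetical.

```python
from typing import Iterable, List, Tuple

# Assumed DC -> citation pairing; see the caveats in the paragraph above.
DC_TO_CITATION = {
    "DC.Title": "citation_title",
    "DC.Creator": "citation_author",
    "DC.Date.created": "citation_publication_date",  # assumption
    "DC.Description": "citation_abstract",
}


def map_dc_tags(dc_tags: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Map (name, content) DC pairs to citation_* pairs, preserving input order."""
    mapped = []
    for name, content in dc_tags:
        target = DC_TO_CITATION.get(name)
        # Skip unmapped tags and empty values, such as the empty DC.Title
        # tags noticed on the dev site.
        if target and content.strip():
            mapped.append((target, content.strip()))
    return mapped


print(map_dc_tags([
    ("DC.Title", ""),
    ("DC.Creator", "Sian Evans"),
    ("DC.Creator", "Christopher Aiden-Lee Jackson"),
]))
```

Because the input order is preserved, `citation_author` entries come out in the same order as the DC.Creator tags. A page whose DC.Title is empty would still need a title from another source.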

