Open satra opened 2 years ago
(Huh, I thought I had filed an issue about this 🐱)
I know very little about this topic. Netlify or external prerendering seems like a fit for our app. A dynamic sitemap that indexes the DLPs was the original idea I was going to run down. I don't know how to evaluate the two types of approach, but perhaps @brianhelba knows something about it. I believe others at Kitware have also done this type of thing before.
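For reference, the sitemap idea can be sketched roughly as below. This is a minimal illustration, not how the archive does or should do it: `build_sitemap` is a hypothetical helper, the landing-page URLs are placeholders on `example.org`, and where the list of dandisets comes from (presumably the API) is left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls: list[str]) -> str:
    """Return a sitemap XML document with one <url><loc> entry per page URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in page_urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder landing-page URLs (assumed pattern, for illustration only).
landing_pages = [
    "https://example.org/dandiset/000001",
    "https://example.org/dandiset/000002",
]
sitemap_xml = build_sitemap(landing_pages)
print(sitemap_xml)
```

A "dynamic" version would regenerate this document whenever dandisets are published, rather than serving a static file.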
I don't have any experience with SEO for SPAs, sorry.
There's some seemingly good info in this article: https://madewithvuejs.com/blog/how-to-make-vue-js-single-page-applications-seo-friendly-a-beginner-s-guide
Some tools to consider:
I would like to revive interest in getting dandiarchive "indexed" by google and its dataset search. I think it would also be valuable because we could recommend that record (schema.org / google dataset description) as a way for other indexers (e.g. https://www.re3data.org/) to automate updating metadata about dandi -- right now it is a stone age "adjust stats in the text form" kinda approach.
Maybe we could even start not with a full listing of dandisets but with a single overall record with stats (which we already gather/request), e.g. following the example in https://developers.google.com/search/docs/appearance/structured-data/dataset but with `@type: Dataset` at the level of the entire archive.
all that needs to happen is to insert markup when rendering the DLP. see related issue here: #784
the fact that we are using jsonld for our metadata makes it very easy to inject our metadata into the DLP (the instructions are at the same place as the google link that yarik shared above).
the api side doesn't have any interaction with google dataset search or google search as far as i know. i'm sure they mine it.
all we need to do is stick our dandiset metadata, which is jsonld, into our DLP generator code using `script` tags in the `head` section.
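A minimal sketch of that injection step, assuming the page is rendered somewhere that can run Python. `jsonld_script_tag` is a hypothetical helper and the record is a placeholder, not real dandiset metadata. The `application/ld+json` script type is the one Google's structured-data guide describes; escaping `<` is an extra precaution so that a value containing `</script>` cannot close the tag early.

```python
import json


def jsonld_script_tag(metadata: dict) -> str:
    """Serialize a JSON-LD record into a <script> tag for the page <head>."""
    payload = json.dumps(metadata, ensure_ascii=False)
    # Replace "<" with its JSON unicode escape; the parsed value is unchanged.
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">{payload}</script>'


# Hypothetical record, for illustration only.
example = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example dandiset",
    "description": "Placeholder description </script> with awkward characters.",
}
tag = jsonld_script_tag(example)
print(tag)
```

In a client-rendered SPA the same tag would have to be added by the frontend (or by a prerendering step), so whether crawlers see it depends on how the page is served.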
yeap,... let's just have it done. FWIW -- here is the location where openneuro injects such a record: https://github.com/OpenNeuroOrg/openneuro/blob/ba6297fd6061e9038ba199627e06b9abe951bef1/packages/openneuro-app/src/scripts/dataset/snapshot-container.tsx#L100 .
What do we actually need to do to our dandiset metadata record for it to become "proper" for google's dataset discovery... would it properly consume our `@context`? should we add at least `@type`?
I naively took a sample metadata record and added it into the `<head>` of a sample html I posted at https://neuro.debian.net/_files/testld.html and pointed https://search.google.com/test/rich-results/result?id=-9qISQt3fuenNtdosFJ3rA to it, but it said that there is no rich metadata.
it expands to "@type": "http://schema.dandiarchive.org/Dandiset", so not schema.org's Dataset, so maybe that is why it refuses? (unlikely I guess).
the type is a key difference. but there are others i think. also i don't know if google supports json-ld 1.1. it didn't seem to want to expand type in their test setup. but there should be some translation possible. we may need to start with the most essential fields `@id`, `@type`, `description` and then add the rest.
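One way such a translation could start is sketched below, under loud assumptions: the input keys (`name`, `description`, `url`, `keywords`) are guesses at what a dandiset record exposes, `to_schema_org_dataset` is a hypothetical helper, and the sample values are placeholders. The output uses the plain schema.org context and `Dataset` type rather than the DANDI context, and includes `name` because Google's Dataset guidelines list `name` and `description` as required properties.

```python
def to_schema_org_dataset(dandiset: dict) -> dict:
    """Build a minimal schema.org Dataset record from a dandiset-like dict."""
    record = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "@id": dandiset["url"],
        "name": dandiset["name"],
        "description": dandiset["description"],
        "url": dandiset["url"],
    }
    # Optional field, copied only when present and non-empty.
    if dandiset.get("keywords"):
        record["keywords"] = dandiset["keywords"]
    return record


# Placeholder input; key names and values are assumptions for illustration.
sample = {
    "name": "Example dandiset title",
    "description": "Placeholder description of the recordings in this dandiset.",
    "url": "https://example.org/dandiset/000000",
}
minimal = to_schema_org_dataset(sample)
print(minimal)
```

Further fields (license, creators, citation and so on) could then be mapped one at a time and checked against the rich results test mentioned above.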
@alessandratrapani and I just tried googling "Recordings from medial entorhinal cortex during linear track and open exploration dandi". This is the title of an old dandiset + "dandi". Nothing came up regarding the DANDI archive. Could we put this on the roadmap? Are there any blockers?
google does index dandiarchive. see this search: recordings site:dandiarchive.org
i know this is not helpful to a user, but it demonstrates that dandi is indexed in google. the question of whether it's the most relevant response would fall in the seo optimization process, and perhaps a sitemap would help. since it already indexes it, and given the things on our plate, i would rather focus on our other features unless there are low hanging changes we can make.
Confirmed on my end:
And it makes sense that there are higher priorities right now. It would be great if anyone who knows about SEO could chime in if there are any low hanging fruit here.
Thanks for the report, Ben. This feature would certainly improve the user experience. I will add this to the backlog so that we can tackle it after our ongoing initiatives.
Interestingly, this KnowledgeSpace query is the first result returned by Google for `Recordings from medial entorhinal cortex during linear track and open exploration dandi`. Curious as to what they are doing differently. Albeit, this Dandiset is at the bottom of the KnowledgeSpace query page.
At present the web ui is not well indexed by google. i've submitted a request. other options are:
also we need to inject the dandiset metadata into the html of the dandiset landing page so that google dataset search can pick it up.