nikdragovic closed this 4 years ago
Looks like we need to add a verification code for Google Webmaster Tools/Google Search Console to our DNS configuration. I have generated a code, which can be referenced in Box.
I'm checking on the logistics of this. -Doug
I am verifying through the LTDS Google account, where we have all the other web properties, which means the site verification code is different from the one in Box. I have submitted ticket INC03371987 requesting verification via DNS.
@eporter23 I have an update on this that I need to run by you and the team. I'll add it to the dlp-launch ticket for our change management meeting discussion. -DG
I closed/canceled the ticket INC03371987 since I need to get with Emily as a next step to determine if we want to proceed with DNS verification of "library.emory.edu" + URL verification for "digital.library.emory.edu" or approach this in another way. I'll resubmit a new ticket once we determine if we are using DNS verification or not.
FYI for further discussion about Google Search Console + Google Analytics work
I see that our Google Analytics is set up for digital.library.emory.edu here: https://analytics.google.com/analytics/web/?authuser=1#/report-home/a164499118w39366856p39046093
The question right now is how best to verify the property digital.library.emory.edu for Google Search Console. Unless there are other insights or objections, I'm inclined to go with domain verification for "library.emory.edu" per instructions from the Emory DNS folks, and then do URL verification for "https://digital.library.emory.edu" per what I've read on the web regarding subdomains and domain verification.
@libdgg from what I can see, if you have the top-level domain confirmed, you can ask Google to crawl any subdomain without needing to re-validate the domain. I just tried with our console and here are the steps I'm seeing:
OPEN THE GOOGLE SEARCH CONSOLE: Here's the main console for our top-level domain, curationexperts.com.
ENTER THE URL FOR THE SITE WITHIN YOUR DOMAIN YOU WANT TO INDEX: Tenejo is our demo Hyrax repository; it's reached as a subdomain of our primary domain (just as digital is a subdomain of library.emory.edu). NOTE: you will need to wait to do this until HTTP Authentication is turned off.
REQUEST REINDEXING FROM THE URL INSPECTION PAGE
CHECK BACK ON PROGRESS PERIODICALLY: Asking Google to index the site without a sitemap will take some time, so you'll want to check in periodically to make sure no issues are encountered.
You can ensure a more thorough indexing by periodically submitting a full sitemap. There are some breadcrumbs here for automating sitemap generation: https://github.com/projectblacklight/blacklight/wiki/Search-engine-harvesting
The Blacklight list on Google Groups or the Code4Lib Slack organization might be able to point you to more sample code for automating the submission of the sitemap to Google.
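To make the sitemap idea concrete, here is a minimal sketch of generating a sitemap file in the standard sitemaps.org XML format. This is not the approach described on the Blacklight wiki; it uses only the Python standard library, the function name `build_sitemap` is made up for illustration, and the record URL under `/catalog/` is a hypothetical placeholder rather than a real item.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given absolute URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the text for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical URLs, for illustration only.
example_urls = [
    "https://digital.library.emory.edu/",
    "https://digital.library.emory.edu/catalog/example-id-1",
]
sitemap_xml = build_sitemap(example_urls)
print(sitemap_xml)
```

In a real repository the URL list would come from the index rather than a hard-coded list, and the sitemap protocol caps a single file at 50,000 URLs, so a large collection would need several files plus a sitemap index.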
@libdgg To see more detail about Google's indexing, you'll want to add the site as a persistent web 'property':
SEE WHICH SITES ARE CONFIGURED
ADD A NEW PROPERTY (SITE) WITHIN YOUR DOMAIN
ADD SUBDOMAINS USING THE URL PREFIX
For example, if we wanted to start indexing our dev site...
Because you (Emory's Google Search account) already own the parent domain, you should be good to go.
CHECK OUT INDEXING COVERAGE FOR YOUR SITE: Notice that we recently cleaned out a lot of sample items that folks had created over time during conferences & workshops, hence the high excluded count.
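The reasoning in the steps above (a verified parent domain covers its subdomains) can be sketched as a small hostname check. This only illustrates the rule as described in this thread; it is not a Search Console API call, and the function name `covered_by_domain_property` is a made-up helper.

```python
def covered_by_domain_property(host, verified_domain):
    """Return True if host is the verified domain itself or one of its subdomains."""
    host = host.lower().rstrip(".")
    verified_domain = verified_domain.lower().rstrip(".")
    # Require a dot boundary so "notlibrary.emory.edu" does not match "library.emory.edu".
    return host == verified_domain or host.endswith("." + verified_domain)


print(covered_by_domain_property("digital.library.emory.edu", "library.emory.edu"))
print(covered_by_domain_property("notlibrary.emory.edu", "library.emory.edu"))
```

The dot-boundary comparison matters: a plain suffix match would wrongly treat an unrelated host that merely ends with the same characters as a subdomain.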
Thanks Mark. I have submitted ticket INC03376253 to have the Emory DNS team verify the primary domain "library.emory.edu". Once that is done I plan to add the "digital.library.emory.edu" property per the info above from Mark.
DNS verification complete for "digital.library.emory.edu" and "library.emory.edu". The Google Search Console shows both properties so it looks like everything worked.
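For anyone who wants to double-check the DNS side later, here is a minimal sketch of confirming that a set of TXT record values contains the expected Google verification token. It assumes the TXT values have already been fetched by some other means (for example with `dig`), and the token and record values shown are placeholders, not our real ones. The helper name `has_verification_token` is made up for illustration.

```python
# Google's DNS verification records are TXT values of this form.
PREFIX = "google-site-verification="


def has_verification_token(txt_records, expected_token):
    """Return True if any TXT value carries the expected verification token."""
    for record in txt_records:
        # TXT values are often shown wrapped in double quotes; strip them.
        value = record.strip().strip('"')
        if value.startswith(PREFIX) and value[len(PREFIX):] == expected_token:
            return True
    return False


# Placeholder values, for illustration only.
example_records = [
    '"v=spf1 -all"',
    '"google-site-verification=EXAMPLE_TOKEN"',
]
print(has_verification_token(example_records, "EXAMPLE_TOKEN"))
```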
@libdgg Can we close this ticket?
Once HTTP auth is removed, trigger a crawl via Google Search Console or Emory-preferred workflow.
Which team is responsible for this process and can be consulted in the future? Is it possible to submit a Blacklight sitemap, and is anyone aware of the status of community development on this?