w3c / wai-website-design

WAI Website Design and Redesign

Smart Search - 1. options / how to do #6

Open James-Green opened 7 years ago

shawna-slh commented 7 years ago

Brief description is at: What is it?

Discussed in EOWG telecon 5 August: item 2, item 13

yatil commented 7 years ago

I tinkered with the minutes and item 13 is now really item 12. :-)

ljm22 commented 7 years ago

Hi all! Shawn, thanks for the invitation to join in, so some questions:

yatil commented 7 years ago

Hi @ljm22, let me try to answer the questions:

  • is this for WAI content only? Or all W3C?

This is just for WAI content and some TR documents. It is probably limited to 300 or so pages all in all (more if we decide to include Wiki pages – Shawn?)

  • Are you committed to a SOLR/Nutch solution

We are exploring options at the moment and are not committed to anything. We don’t yet know what system resources we can get, so the simpler the better, I think.

  • where is the user searching from? On-site search? External search engine? Both? Do you have numbers (e.g. percentage split)?

We get a good percentage of external site search, but mostly for obvious terms like “web accessibility”. We have, however, realized that a lot of people look for and find the search link at the bottom of our pages and click on it. The goal is to give users an easier way to search, bundling some pages into clusters of information (so that if someone searches for “alt attribute” they don’t get only every page of the Images tutorial but a link on top that clearly features the Images tutorial as a whole).

  • if external, which search engines to be given consideration first? Google, Bing, Baidu, Yandex, Naver?

We currently have a Google site search, which works relatively well but does not provide us with the “smart” functionality we’d like to see. The W3C homepage just recently adopted DuckDuckGo for searching. Probably we could have the clustered information as something separate and use an API to pull in data from DDG… We’re open to everything :-)

  • how will you measure success? I assume you still have no analytics? What about comparing on-page survey ('did you find this content useful') before and after?

We expect the search to be an integral component of the redesign, so I don’t think there is a lot of value in comparing before and after numbers. The main goal is to lead users to valuable and current resources efficiently, be it through navigation or search.

I personally think that a survey like the one you proposed would be useful nonetheless, to see how people value our resources.

shawna-slh commented 7 years ago

Hi Liam! Some additions to Eric/yatil's info:

Thanks! ~Shawn

yatil commented 7 years ago

Here are some options that @vuxcaleb researched earlier and I got via email. Listing them here, with some quick comments:

Another question is how we make sure that our solution performs well, depending on how prominently the search is implemented. Maybe we need a hybrid approach: text-based search from above, plus a (custom?) directory of resources that is then displayed alongside the search.
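To make the hybrid idea a bit more concrete, here is a minimal sketch of how a curated directory could sit on top of whatever text search we end up with. Everything in it is an assumption for illustration: the `Cluster` type, the function names, the example.org URLs, and the keywords are hypothetical, and the text results are simply passed in as a list of URLs from some other search backend.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Cluster:
    """A hand-curated bundle of pages presented as one resource."""
    title: str
    url: str
    keywords: Tuple[str, ...]


# Hypothetical directory entries: titles, URLs, and keywords are placeholders.
DIRECTORY: List[Cluster] = [
    Cluster("Images Tutorial", "https://example.org/tutorials/images/",
            ("alt attribute", "alt text", "image")),
    Cluster("Forms Tutorial", "https://example.org/tutorials/forms/",
            ("form", "label")),
]


def featured_clusters(query: str, directory: List[Cluster] = DIRECTORY) -> List[Cluster]:
    """Return clusters with a keyword that occurs in the query (case-insensitive)."""
    normalized = " ".join(query.lower().split())
    # Plain substring matching: simple, but it will over-match short keywords.
    return [c for c in directory if any(k in normalized for k in c.keywords)]


def hybrid_results(query: str, text_results: List[str],
                   directory: List[Cluster] = DIRECTORY) -> Dict[str, List[str]]:
    """Put featured cluster links on top and keep the remaining text hits below."""
    featured = featured_clusters(query, directory)
    featured_urls = {c.url for c in featured}
    return {
        "featured": [c.url for c in featured],
        # Drop text hits that duplicate a featured link.
        "results": [u for u in text_results if u not in featured_urls],
    }
```

A search for “alt attribute” would then show the Images Tutorial link as featured, followed by the individual page hits from the text search.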

lakeen commented 7 years ago

Chris Thatcher, lead engineer with the Library of Congress search team, will also join our calls on the 24th and 31st.

I sent him a link to the call-in information and cc'd him here.

Thanks, Laura



thatcher commented 7 years ago

Sorry I missed the 24th. Will be on today 1pm Eastern!

thatcher commented 7 years ago

Here are some possible technologies to consider not listed above:

Project Blacklight provides a lot of smart search features, but you have to host the Solr index.

Elasticsearch is a search index you can purchase through AWS. You still have to crawl your content, put it in the index, and write the HTML interface for search results.
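To illustrate the "put it in the index" step, here is a rough sketch that uses a tiny in-memory inverted index as a stand-in. It is not Elasticsearch and does not use its API; the function names and the shape of the `pages` input (URL mapped to already-crawled plain text) are assumptions made only for this example.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map every token to the set of URLs whose text contains it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for url, text in pages.items():
        for token in tokenize(text):
            index[token].add(url)
    return index


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return the URLs that contain every token of the query, sorted by URL."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # index.get avoids creating empty entries for unknown tokens.
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)
```

With roughly 300 pages, something this small would fit in memory, but it has no ranking, stemming, or phrase matching, which is what a real search index would add.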

ljm22 commented 7 years ago

Also Apache Solr/Lucene/Nutch http://lucene.apache.org/solr/

yatil commented 7 years ago

This is DDG’s developer documentation: http://docs.duckduckhack.com

ljm22 commented 7 years ago

Who do I need to chase for Piwik access? @eric ?

yatil commented 7 years ago

@ljm22 You should receive an email any minute now ;-)

ljm22 commented 7 years ago

PIWIK

Thanks! Any chance of getting into the Piwik dashboard (and API) or is that out of bounds?

Google Search Console info

Can we get the attached HTML file unzipped and placed at the root of either www.w3.org/WAI/ or just straight in at www.w3.org/? It's blank, just a verification proof so I can get at Google Search Console data.

google85be2fa2d80976f2.zip

It works if http://www.w3.org/WAI/google85be2fa2d80976f2.html results in a 200 OK and a blank page.
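A quick way to check that condition once the file is in place, as a sketch only: the function names are made up for this example, and "blank page" is assumed to mean a body that is empty or whitespace only.

```python
import urllib.error
import urllib.request


def is_verified(status: int, body: bytes) -> bool:
    """True for a 200 response whose body is empty or whitespace only."""
    return status == 200 and body.strip() == b""


def check(url: str, timeout: float = 10.0) -> bool:
    """Fetch the URL and apply is_verified; any HTTP or network error counts as False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_verified(resp.status, resp.read())
    except urllib.error.URLError:
        # HTTPError (e.g. 404) is a subclass of URLError, so it lands here too.
        return False
```

Calling `check("http://www.w3.org/WAI/google85be2fa2d80976f2.html")` should return `True` once the file has been deployed as described.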

WAI Sitemap crawl from treecreeper

Treecreeper is our proprietary ontological site mapper. The output is a mindmap file (view it in FreeMind etc.; I can do a fairly large PDF if you prefer). Is that what we want to concentrate on? Anything obviously missing? WAI-sitemap-v2.zip

James-Green commented 7 years ago

Yes.

[Trimmed email overhead – Eric]

vuxcaleb commented 7 years ago

Liam – Please reach out to Eric for access to the analytics.

Best, Caleb

[Trimmed email overhead – Eric]

yatil commented 7 years ago

@James-Green & @vuxcaleb I already reached out to Liam, see my reply above. I am following this thread.

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.