commonsearch / cosr-back

Backend of Common Search. Analyses webpages and sends them to the index.
https://about.commonsearch.org
Apache License 2.0

Smarter page titles & descriptions #6

Open sylvinus opened 8 years ago

sylvinus commented 8 years ago

Some page titles out there are either plain wrong or unhelpful. It's way worse for descriptions.

Most other search engines take some liberty and don't just use the <title> tag as source for the title or the <meta> tags for the description.

Other sources of data could include:

Any other ideas?

How to choose between these sources will be a complex topic, but we can build something reasonably simple in the short term. We already support a blacklist of titles and a few fallbacks in formatting.py.
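As a rough illustration of the "reasonably simple" approach, here is a minimal sketch of picking the first usable title from an ordered list of candidates, with a blacklist and a URL fallback. The names `TITLE_BLACKLIST` and `pick_title`, and the blacklist entries, are hypothetical; they are not taken from formatting.py, whose actual logic may differ.

```python
# Hypothetical blacklist of unhelpful titles (assumption, not the real list).
TITLE_BLACKLIST = {"home", "untitled", "index", "welcome"}


def pick_title(candidates, url):
    """Return the first usable candidate title, else fall back to the URL.

    `candidates` is an ordered list of strings (or None), most trusted first.
    """
    for candidate in candidates:
        if not candidate:
            continue
        # Collapse runs of whitespace and trim the ends.
        cleaned = " ".join(candidate.split())
        if cleaned and cleaned.lower() not in TITLE_BLACKLIST:
            return cleaned
    # Nothing usable was found: the URL is the last-resort fallback.
    return url
```

The order of `candidates` encodes the preference between sources, so changing the policy only means reordering the list passed in.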

OriPekelman commented 8 years ago

You should prefer Facebook (Open Graph) / Twitter Cards metadata when present. These would usually give you better results than title/description.

http://ogp.me/

https://dev.twitter.com/cards/markup
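A minimal sketch of that suggestion, assuming the page HTML is available as a string: prefer `og:title` / `og:description`, then the `twitter:` equivalents, and only then fall back to `<title>` and the plain `description` meta tag. The class and function names are hypothetical and this is not how cosr-back does it; it only uses Python's standard `html.parser`. Open Graph tags use the `property` attribute while Twitter Cards use `name`, so both are checked.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects <meta> name/property values and the <title> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph uses `property`, Twitter Cards use `name`.
            key = attrs.get("property") or attrs.get("name")
            content = attrs.get("content")
            if key and content:
                # Keep the first occurrence of each key.
                self.meta.setdefault(key.lower(), content.strip())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title_and_description(html):
    """Return (title, description), preferring card metadata when present."""
    parser = MetaExtractor()
    parser.feed(html)
    meta = parser.meta
    title = (meta.get("og:title")
             or meta.get("twitter:title")
             or parser.title.strip())
    description = (meta.get("og:description")
                   or meta.get("twitter:description")
                   or meta.get("description", ""))
    return title, description
```

Card metadata can itself be wrong or spammy, so in practice the values returned here would likely still need to go through the same blacklist and fallback checks as the regular title.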