Closed ana-kuznetsova closed 6 years ago
Please note that this is a general overview; a thorough, detailed analysis will be given after all the data have been collected.
Polit.ru/lectures offers a remarkable variety of public lectures given by representatives of various scientific fields.
Site structure: not HTML5, so there is no semantic markup, which makes the site harder to crawl.
Rubric structure: the articles can be filtered by author, by topic, or chronologically. The best option is to crawl the pages chronologically, back to the earliest publication (20.12.2004).
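A chronological crawl needs a stopping rule at the earliest publication date. Below is a minimal sketch of that rule. It assumes that listing pages are read newest-first and that each entry carries a date in DD.MM.YYYY form; neither assumption has been verified against the site, and the function names and sample entries are hypothetical.

```python
from datetime import date, datetime

# Earliest publication date noted in the overview (20.12.2004).
EARLIEST = date(2004, 12, 20)


def parse_listing_date(text: str) -> date:
    """Parse a DD.MM.YYYY string (assumed date format, not verified on the site)."""
    return datetime.strptime(text.strip(), "%d.%m.%Y").date()


def take_until_earliest(entries):
    """Yield (date, url) pairs from a newest-first listing.

    Stops as soon as an entry is older than EARLIEST.
    """
    for date_text, url in entries:
        published = parse_listing_date(date_text)
        if published < EARLIEST:
            break
        yield published, url


# Hypothetical listing entries, newest first.
sample_entries = [
    ("21.12.2004", "/lectures/a"),
    ("20.12.2004", "/lectures/b"),
    ("19.12.2004", "/lectures/c"),
]
kept = list(take_until_earliest(sample_entries))
```

Here `kept` holds the first two entries only. The real crawler would feed this generator with entries scraped from each listing page.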
Article structure:
<strong></strong>
).
Indicator, run by the Rambler&Co media holding, is an information portal about science. It publishes daily news from the Russian and international scientific communities. The resource also carries polemical articles about the Russian scientific system and about the relations between science and business, which can be of great interest for our research. The crucial content for us is the lectures and interviews given by famous scientists. Note, however, that translated interviews with foreign scientists are also included; should we consider them?
Site structure: not HTML5, so there is no semantic markup, which makes the site harder to crawl.
Rubrics structure: every piece of content is tagged with 1 of 10 topics (Astronomy, Biology, Humanities etc.). Apart from that, every publication is subcategorized as News, Discoveries of Russian Scientists or Discussion Club. We can also trace a further division of the articles into smaller rubrics that cannot be filtered on the site by default; their names are given either in the title or in the lead paragraph, e.g. "...рассказывает сегодняшний выпуск рубрики «История науки»" ("...as today's issue of the 'History of Science' rubric tells us"). For now the most plausible strategy seems to be crawling the topic pages, since all the genres will be found there anyway.
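Since the smaller rubrics are only named in the title or lead paragraph, they could be recovered with a pattern match on the text. The sketch below is one possible approach, assuming the rubric name always follows the word "рубрики" and is wrapped in «» quotes, as in the example above. Other phrasings would need additional patterns, and the function name is hypothetical.

```python
import re
from typing import Optional

# Assumed pattern: the word "рубрики" followed by a name in «» quotes.
RUBRIC_RE = re.compile(r"рубрики\s+«([^»]+)»")


def extract_rubric(text: str) -> Optional[str]:
    """Return the rubric name mentioned in a title or lead paragraph, if any."""
    match = RUBRIC_RE.search(text)
    return match.group(1) if match else None


lead = "...рассказывает сегодняшний выпуск рубрики «История науки»"
rubric = extract_rubric(lead)
```

For the lead paragraph quoted above, `rubric` is `"История науки"`; text without such a phrase gives `None`.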
Article structure:
<h1></h1>
). Can be omitted.
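Because neither site uses semantic HTML5 elements, extraction has to key on generic tags such as the `<h1>` and `<strong>` mentioned above. The following is a minimal sketch using Python's standard `html.parser`. Which elements actually carry the title or the speaker's name is not confirmed by this overview, so the tag choice, the class name and the sample HTML are all assumptions.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect the text content of selected tags (assumed here: h1 and strong)."""

    def __init__(self, tags=("h1", "strong")):
        super().__init__()
        self.tags = set(tags)
        self.found = {tag: [] for tag in self.tags}
        self._stack = []  # currently open tags of interest, with text buffers

    def handle_starttag(self, tag, attrs):
        if tag in self.tags:
            self._stack.append((tag, []))

    def handle_data(self, data):
        # Text belongs to every open tag of interest that encloses it.
        for _, buffer in self._stack:
            buffer.append(data)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1][0] == tag:
            _, buffer = self._stack.pop()
            text = "".join(buffer).strip()
            if text:
                self.found[tag].append(text)


# Invented sample markup, only for illustration.
sample_html = "<div><h1>Lecture title</h1><p><strong>Speaker:</strong> text</p></div>"
collector = TagTextCollector()
collector.feed(sample_html)
collector.close()
```

After parsing, `collector.found` maps each tag name to the list of text fragments found inside it, which can then be checked against the real article layout.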
Review and analyse http://www.polit.ru/lectures/publ_lect/ and https://indicator.ru/ according to the checklist:
First-person speech; set of rubrics and headings; layout of the expert's opinion in the HTML code (if there is one).