ana-kuznetsova / Popular-Science-Texts-Compling-research

An M.A. educational project on computational linguistics.

Review Geektimes & Cherdak #3

Closed ana-kuznetsova closed 6 years ago

ana-kuznetsova commented 6 years ago

Review and analyse https://chrdk.ru/ and https://geektimes.ru/hub/popular_science/ according to the checklist:

First-person speech; Set of rubrics & headings; Layout of expert's opinion in html code (if there is one).

AnyaLa commented 6 years ago

geektimes.ru/hub/popular_science is a source devoted to user reviews on popular science topics. The main types of articles are:

  1. Users’ reviews https://geektimes.ru/post/295115/
  2. Translations of scientific articles https://geektimes.ru/post/295131/
  3. Company blogs presenting recent developments or interviews conducted by the company https://geektimes.ru/company/misis/blog/263124/ https://geektimes.ru/company/vertdider/blog/291153/
  4. Interviews with scientists or professionals https://geektimes.ru/post/295129/

Articles are tagged with rubrics (Space, Physics, Astronomy, Artificial Intelligence, Ecology, Chemistry, Biotechnology, Transport of the future, Brain, Science fiction, Laser, Nanotechnology, Quantum technology).

Interviews and translations have specific tags. The interviewer’s text is marked with html styles or tags. Experts' comments in user reviews are not tagged in html; they are only enclosed in quotation marks.
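Since expert comments are only marked with quotation marks, one way to pull them out of the plain text is a regular expression over quoted spans. This is a minimal sketch, not a tested extractor for the site: the set of quote characters and the `min_words` threshold (used to skip short quoted terms that are not real comments) are assumptions.

```python
import re

# Quote pairs assumed to occur in the texts:
# guillemets, curly quotes, straight double quotes.
QUOTE_PATTERN = re.compile(r'«([^«»]+)»|“([^“”]+)”|"([^"]+)"')


def extract_quotes(text, min_words=3):
    """Return quoted spans that contain at least min_words words."""
    quotes = []
    for match in QUOTE_PATTERN.finditer(text):
        # Exactly one of the three groups is filled for each match.
        span = next(g for g in match.groups() if g is not None)
        if len(span.split()) >= min_words:
            quotes.append(span.strip())
    return quotes
```

Short quoted phrases such as «dark matter» are dropped by the word threshold, while full-sentence quotations are kept. Attributing a quotation to a particular expert would still need extra work.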

Articles have comments and reader ratings. The ratings could be used to estimate the popularity of different topics.
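A rough sketch of how ratings could be aggregated per rubric, assuming each post has already been collected as a dictionary. The field names `rubrics` and `rating` are hypothetical and do not come from the site or its API.

```python
from collections import defaultdict
from statistics import mean


def mean_rating_by_rubric(posts):
    """Average the post ratings for every rubric.

    posts: list of dicts with hypothetical keys
           "rubrics" (list of str) and "rating" (number).
    """
    ratings = defaultdict(list)
    for post in posts:
        # A post tagged with several rubrics counts towards each of them.
        for rubric in post["rubrics"]:
            ratings[rubric].append(post["rating"])
    return {rubric: mean(values) for rubric, values in ratings.items()}
```

The mean is only one possible popularity estimate; the number of comments or the median rating could be swapped in the same way.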

API https://github.com/thematicmedia/habrahabr_api provides different methods for content processing (comments, user profiles, metadata etc.)

nevmenandr commented 6 years ago

Does the API allow you to get the text data?

AnyaLa commented 6 years ago

Cherdak (Чердак) chrdk.ru

Articles on popular science and technology topics, written mostly by science journalists.

  1. News https://chrdk.ru/news/ne-dumai-o-slonah
  2. Scientific articles (long-reads) https://chrdk.ru/sci/editing_the_biosphere
  3. Interviews https://chrdk.ru/sci/new-ras-president-interview
  4. Video, photo, infographics (not considered within the project)

All text materials are divided into news and articles; interviews are counted as articles. The interviewer’s text is marked with html styles. Experts’ comments in long-reads are only enclosed in quotation marks.

There are lots of tags marking the articles (more than 100), but many of them do not denote a topic (rubric), for instance the tag ‘Kurchatov Institute’.
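One simple way to separate topical tags from the rest is to check each tag against a hand-curated list of rubrics. The sketch below assumes such a list exists; the entries in `RUBRIC_TAGS` are placeholders, not the site's actual rubrics.

```python
# Hypothetical hand-curated set of topical tags (rubrics), lower-cased.
RUBRIC_TAGS = {"physics", "biology", "space"}


def split_tags(tags, rubric_tags=RUBRIC_TAGS):
    """Split an article's tags into (rubrics, other), ignoring case."""
    rubrics = [t for t in tags if t.lower() in rubric_tags]
    other = [t for t in tags if t.lower() not in rubric_tags]
    return rubrics, other
```

With this list, `split_tags(["Physics", "Kurchatov Institute"])` puts "Physics" among the rubrics and "Kurchatov Institute" among the other tags.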

AnyaLa commented 6 years ago

@nevmenandr, some methods described in the documentation allow you to get posts and metadata, but I haven't tried them yet.