uribo / yondaronbun

Notes on papers to read and papers already read

Jaafar_2021: New Arabic Medical Dataset for Diseases Classification #7

Open uribo opened 3 years ago

github-actions[bot] commented 3 years ago

Information

:page_with_curl: Title: New Arabic Medical Dataset for Diseases Classification
:busts_in_silhouette: Author: Jaafar Hammoud, Aleksandra Vatian, Natalia Dobrenko, Nikolai Vedernikov, Anatoly Shalyto, Natalia Gusarova
:link: URL: http://arxiv.org/abs/2106.15236v3
:date: Submitted: 2021-06-29 10:42:53 (Update: 2021-07-05 12:41:21)

Abstract

  The Arabic language suffers from a great shortage of datasets suitable for
training deep learning models, and the existing ones include general
non-specialized classifications. In this work, we introduce a new Arabic
medical dataset, which includes two thousand medical documents collected from
several Arabic medical websites, in addition to the Arab Medical Encyclopedia.
The dataset was built for the task of text classification and includes 10
classes of diseases (Blood, Bone, Cardiovascular, Ear, Endocrine, Eye,
Gastrointestinal, Immune, Liver, and Nephrological). Experiments on the dataset
were performed by fine-tuning three pre-trained models: BERT from Google,
Arabert, which is based on BERT with a large Arabic corpus, and AraBioNER,
which is based on Arabert with an Arabic medical corpus.

Article metrics

| cited_by_posts_count | cited_by_tweeters_count | cited_by_accounts_count | last_updated | score |
| --- | --- | --- | --- | --- |
| 12 | 6 | 6 | 2021-07-06 01:40:52 | 1.5 |

http://www.altmetric.com/details.php?citation_id=108404669