asreview / asreview-datatools

Tool to preprocess datasets for ASReview
MIT License

Query - merging RIS files #48

Open J535D165 opened 1 day ago

J535D165 commented 1 day ago

Discussed in https://github.com/asreview/asreview/discussions/1870

Originally posted by **Emanuel-1986** October 7, 2024

Hi,

I have downloaded RIS files from Embase (Elsevier), CINAHL (Ebsco), and Epistemonikos. Then, I downloaded the .nbib file from PubMed and converted it to RIS using RefWorks software. To check that ASReview reads the RIS files, I used the describe command. None of the RIS files had any missing titles and a very small fraction had missing abstracts, but a chunk were unlabelled. When I merged all 4 RIS files using vstack, the number of missing abstracts and titles increased dramatically. This also happened when I merged 2 RIS files, as shown in the table below.

Can you help me overcome this? Am I doing something wrong with the code when using vstack?

Regards, Emanuel

Database / merged file | Missing abstracts | Missing titles | Unlabelled | Number of records
-- | -- | -- | -- | --
PubMed | 1 | 0 | 1721 | 1721
Embase | 247 | 0 | Null | 7762
CINAHL | 108 | 0 | 1405 | 1405
Epistemonikos | 10 | 0 | Null | 907
PubMed + Embase | 1968 | 0 | 9483 | 9483
CINAHL + Epistemonikos | 118 | 1405 | 2312 | 2312
All 4 merged | 4280 | 10888 | 11795 | 11795

Code:

    cd C:\Users\Schembri\Desktop
    asreview data describe pubmed.ris
    asreview data describe Embase.ris
    asreview data describe CINAHL.ris
    asreview data describe epistemonikos.ris
    asreview data vstack output/merged_RIS_file.ris Pubmed.ris Embase.ris CINAHL.ris epistemonikos.ris
    asreview data describe output/merged_RIS_file.ris

**Version information**

- OS: Windows
- Browser: Chrome
- ASReview version: 1.2.1

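
The counts reported in the post can be checked with a little arithmetic. The sketch below is only a sanity check on the numbers from the table, not a diagnosis: it compares the missing values in each merged file with the sum of the missing values in its inputs. The names `single`, `merged`, and `extra_missing` are made up for this illustration.

```python
# Counts copied from the table in the post:
# (missing abstracts, missing titles, number of records).
single = {
    "PubMed": (1, 0, 1721),
    "Embase": (247, 0, 7762),
    "CINAHL": (108, 0, 1405),
    "Epistemonikos": (10, 0, 907),
}
merged = {
    ("PubMed", "Embase"): (1968, 0, 9483),
    ("CINAHL", "Epistemonikos"): (118, 1405, 2312),
    ("PubMed", "Embase", "CINAHL", "Epistemonikos"): (4280, 10888, 11795),
}


def extra_missing(sources):
    """Missing values in the merged file beyond the sum of its inputs."""
    expected_abstracts = sum(single[s][0] for s in sources)
    expected_titles = sum(single[s][1] for s in sources)
    merged_abstracts, merged_titles, merged_records = merged[sources]
    # If vstack only appends rows, the record counts must add up exactly.
    assert merged_records == sum(single[s][2] for s in sources)
    return merged_abstracts - expected_abstracts, merged_titles - expected_titles


for sources in merged:
    extra_abstracts, extra_titles = extra_missing(sources)
    print(
        " + ".join(sources),
        "| extra missing abstracts:", extra_abstracts,
        "| extra missing titles:", extra_titles,
    )
```

The record counts add up in every merge, so no rows are lost. For PubMed + Embase, the 1720 extra missing abstracts equal the number of PubMed records that had an abstract before merging (1721 - 1). For CINAHL + Epistemonikos, the 1405 extra missing titles equal the CINAHL record count. In both cases the numbers are consistent with one source losing a whole field in the merge rather than with scattered missing values.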
J535D165 commented 1 day ago

We will address this issue in https://github.com/asreview/asreview/pull/1820. However, a fix to datatools is still welcome.
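
One possible explanation, which this thread does not confirm, is that the exports store the same field under different column names, so stacking the rows leaves gaps in each column. The sketch below illustrates only that general mechanism in plain Python. It is not the asreview-datatools implementation, and the `vstack` helper, the `count_missing` helper, and the column names `title` and `primary_title` are hypothetical.

```python
def vstack(*tables):
    """Stack lists of dict records; columns absent from a record become None."""
    columns = []
    for table in tables:
        for record in table:
            for key in record:
                if key not in columns:
                    columns.append(key)
    return [
        {column: record.get(column) for column in columns}
        for table in tables
        for record in table
    ]


def count_missing(records, column):
    """Count records whose value for `column` is absent or empty."""
    return sum(1 for record in records if not record.get(column))


# Hypothetical data: both sources have titles, but under different column names.
source_a = [
    {"title": "Study A1", "abstract": "Text A1"},
    {"title": "Study A2", "abstract": "Text A2"},
]
source_b = [
    {"primary_title": "Study B1", "abstract": "Text B1"},
]

stacked = vstack(source_a, source_b)
print(len(stacked))                       # 3 records, none lost
print(count_missing(source_a, "title"))   # 0 missing before stacking
print(count_missing(stacked, "title"))    # 1 missing after stacking
```

If something like this is happening, comparing the column names that each file yields when read on its own would show which field names differ between sources.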