Closed nniiicc closed 2 months ago
@nniiicc -- I have decided to drop all of the articles with empty abstracts, because it can be hard to judge an article from the title alone. I will document this in the coding notes/protocol. I would drop the null values in the "abstract" column of dedup_clean.csv before combining it with your previously coded dataset, because the capitalization of the column labels does not match between the two datasets.
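A minimal sketch of that step, in plain Python so it runs without extra packages. The sample rows, the `included` column, and the helper names `drop_empty_abstracts` and `lowercase_columns` are made up for illustration; only the file name dedup_clean.csv and the "abstract" column come from the comment above. It assumes that lower-casing the column labels is enough to make the two datasets line up.

```python
import csv
import io


def lowercase_columns(rows):
    """Lower-case and strip every column label so both datasets share the same keys."""
    return [{key.strip().lower(): value for key, value in row.items()} for row in rows]


def drop_empty_abstracts(rows, column="abstract"):
    """Keep only rows whose abstract is non-empty after stripping whitespace."""
    return [row for row in rows if (row.get(column) or "").strip()]


# Tiny stand-ins for dedup_clean.csv and the previously coded dataset (made-up rows).
new_csv = "Title,Abstract,DOI\nPaper A,Some abstract,10.1/a\nPaper B,,10.1/b\n"
coded_csv = "title,abstract,doi,included\nPaper C,Another abstract,10.1/c,1\n"

new_rows = lowercase_columns(list(csv.DictReader(io.StringIO(new_csv))))
coded_rows = lowercase_columns(list(csv.DictReader(io.StringIO(coded_csv))))

# Drop the rows with an empty abstract first, then combine the two datasets.
new_rows = drop_empty_abstracts(new_rows)
combined = coded_rows + new_rows

for row in combined:
    print(row["title"], "|", row["abstract"])
```

With real files, the `io.StringIO(...)` stand-ins would be replaced by `open("dedup_clean.csv", newline="")` and the path to the coded dataset.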
@sarah114tran - documenting data collection and some preliminary cleaning of the dataset here.
Two new directories in the repo:
Next steps:
Combine dedup_clean.csv with our previously coded dataset from ASReview (where we agreed on what to include and not include)
Query:
For Google Scholar query:
Total of 3474 observations
Applying inclusion and exclusion criteria
Date: removed as out of range - 661
Deduplication:
Original number of rows: 2813
Number of duplicates identified by title: 959
Number of rows after title de-duplication: 2241
Number of duplicates identified by DOI: 473
Final number of rows after DOI de-duplication: 1802
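For reference, a two-pass de-duplication like the one reported above could be sketched as follows. This is not the script that produced these counts. The `dedup` and `normalize` helpers and the sample rows are hypothetical, and the sketch assumes that matching is case- and whitespace-insensitive, that the first occurrence is kept, and that rows with a missing title or DOI are never treated as duplicates of each other.

```python
def normalize(value):
    """Lower-case and collapse whitespace so trivially different strings compare equal."""
    return " ".join((value or "").lower().split())


def dedup(rows, key):
    """Keep the first row for each normalized value of `key`; keep all rows where it is empty."""
    seen = set()
    kept = []
    for row in rows:
        value = normalize(row.get(key))
        if value and value in seen:
            continue  # a later copy of something already kept
        if value:
            seen.add(value)
        kept.append(row)
    return kept


# Made-up rows for illustration only.
rows = [
    {"title": "Data Sharing in Practice", "doi": "10.1/a"},
    {"title": "data sharing in  practice", "doi": ""},        # duplicate by title
    {"title": "Open Data Policies", "doi": "10.1/b"},
    {"title": "Open data policies (preprint)", "doi": "10.1/B"},  # duplicate by DOI
    {"title": "Untitled report", "doi": ""},                  # no DOI, kept
]

after_title = dedup(rows, "title")
after_doi = dedup(after_title, "doi")

print("Original number of rows:", len(rows))
print("Rows after title de-duplication:", len(after_title))
print("Rows after DOI de-duplication:", len(after_doi))
```

Running the title pass before the DOI pass mirrors the order of the counts above; rows without a DOI pass through the second step untouched under the stated assumption.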
Crossref does not allow for search by type... so clean by type (removed the following)
Final dataset: 1692
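As a quick arithmetic check on the counts in this comment, the rows removed at each step can be recomputed from the reported totals. The step labels are paraphrased and the snippet is only a tally, not part of the cleaning code.

```python
# Row counts as reported in this thread, in order.
steps = [
    ("Total observations", 3474),
    ("After date range filter", 2813),
    ("After title de-duplication", 2241),
    ("After DOI de-duplication", 1802),
    ("After cleaning by type", 1692),
]

# Rows removed at each step = previous count minus current count.
removed = [
    (label, previous - current)
    for (_, previous), (label, current) in zip(steps, steps[1:])
]

for label, count in removed:
    print(f"{label}: removed {count}")

total_removed = sum(count for _, count in removed)
print("Total removed:", total_removed)
```

The date step removes 661 rows, matching the figure above, and the type step removes 110. The two de-duplication steps remove 572 and 439 rows, which is fewer than the 959 and 473 duplicates identified. One possible reading, stated here only as an assumption, is that the "identified" counts include every row in a duplicate group while one copy per group was kept; this would be worth confirming in the coding notes.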