Open wgmueller1 opened 7 years ago
We could add the subject of the research, i.e. the domain in which the research was conducted.
@souravsingh do you have any ideas for this? We would probably need some additional work here. For the PubMed dataset, we have the journal name, but don't have categories or keywords (that I know of).
I think PubMed contains a tag called MeSH Major Topic; we could use that.
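A rough sketch of how the major topics could be pulled out, assuming we have the records as PubMed XML, where each `MeshHeading` carries a `MajorTopicYN` flag on its `DescriptorName` and `QualifierName` elements. The sample record and the function name `mesh_major_topics` are made up for illustration, not taken from our dataset:

```python
import xml.etree.ElementTree as ET

# Hand-written record in the shape of PubMed XML (illustrative only, not real data).
SAMPLE = """
<PubmedArticle>
  <MedlineCitation>
    <PMID>0</PMID>
    <MeshHeadingList>
      <MeshHeading>
        <DescriptorName MajorTopicYN="N">Neoplasms</DescriptorName>
        <QualifierName MajorTopicYN="Y">genetics</QualifierName>
      </MeshHeading>
      <MeshHeading>
        <DescriptorName MajorTopicYN="N">Humans</DescriptorName>
      </MeshHeading>
      <MeshHeading>
        <DescriptorName MajorTopicYN="Y">Peer Review, Research</DescriptorName>
      </MeshHeading>
    </MeshHeadingList>
  </MedlineCitation>
</PubmedArticle>
""".strip()


def mesh_major_topics(article):
    """Return descriptor names flagged as major topics for one PubmedArticle element."""
    topics = []
    for heading in article.iter("MeshHeading"):
        descriptor = heading.find("DescriptorName")
        if descriptor is None:
            continue
        # Treat the heading as major if the descriptor or any of its
        # qualifiers is marked MajorTopicYN="Y".
        flags = [descriptor.get("MajorTopicYN")]
        flags += [q.get("MajorTopicYN") for q in heading.findall("QualifierName")]
        if "Y" in flags:
            topics.append(descriptor.text)
    return topics


article = ET.fromstring(SAMPLE)
print(mesh_major_topics(article))
```

Whether a heading should count as major when only a qualifier is flagged is a judgment call; dropping the qualifier check would give a stricter list.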
I'll work on a feature extractor and include the following (add to list if you have other ideas).
We need to identify features of interest and extract them from each article. This may necessitate bringing in additional data. For example, Impact Factor may be useful. I have access to Web of Science and can download the impact factors for each year, but I need to identify the date of publication for each article.
What other information / features are people interested in?
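On the impact factor idea, here is a minimal sketch of getting the publication year and joining it to a per-year impact factor table. It assumes the articles are PubMed XML with a `PubDate` under `JournalIssue`, and that the Web of Science download can be reduced to a dict keyed by (journal title, year). The helper names, the sample record, and the table values are placeholders, not real data:

```python
import re
import xml.etree.ElementTree as ET

# Hand-written record in the shape of PubMed XML (illustrative only).
SAMPLE = """
<PubmedArticle>
  <MedlineCitation>
    <Article>
      <Journal>
        <JournalIssue>
          <PubDate><Year>2015</Year><Month>Jan</Month></PubDate>
        </JournalIssue>
        <Title>Example Journal</Title>
      </Journal>
    </Article>
  </MedlineCitation>
</PubmedArticle>
""".strip()

# Hypothetical lookup table; the value is a placeholder, not a real impact factor.
IMPACT_FACTORS = {("Example Journal", 2015): 1.0}


def publication_year(article):
    """Return the publication year of one PubmedArticle element, or None."""
    pub_date = article.find(".//JournalIssue/PubDate")
    if pub_date is None:
        return None
    year = pub_date.findtext("Year")
    if year:
        return int(year)
    # Some records have a free-text MedlineDate (e.g. "1998 Dec-1999 Jan")
    # instead of Year; fall back to the first four-digit number in it.
    match = re.search(r"\d{4}", pub_date.findtext("MedlineDate") or "")
    return int(match.group()) if match else None


def impact_factor(article, table):
    """Look up the impact factor for the article's journal and year, or None."""
    title = article.findtext(".//Journal/Title")
    return table.get((title, publication_year(article)))


article = ET.fromstring(SAMPLE)
print(publication_year(article), impact_factor(article, IMPACT_FACTORS))
```

Matching on the journal title string is fragile; if the Web of Science export includes ISSNs, keying the table on ISSN would probably be more reliable.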