Open golnazads opened 2 weeks ago
Last year, the output also included a count of how many times each feature name appeared across all the papers, for a specific feature name/feature type/target. Should it be there, or should I remove it? @aaccomazzi
If I understand this question correctly, the count is superfluous because one can get it by simply counting the number of times a particular feature appears in the CSV file (which is the number of articles where feature X appears). Am I getting this right?
A more interesting metric would be the number of times the feature appears within a given paper. With this number we could compute, if we wanted, TF/IDF for each feature in each paper, which may be a useful metric for retrieval further down the line. Is this something you can easily generate?
So instead of summing all the instances per feature name and reporting the total, you want to see the per-paper counts individually, so that you can sum them later if you want? OK.
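For reference, here is a rough sketch of how TF/IDF could be computed from per-paper counts. It is only an illustration under assumptions: the function name `tf_idf` and the `(bibcode, feature_id, count)` tuple layout are hypothetical, and the term frequency is normalized by the total number of feature mentions in the paper, not by the paper's full word count. The number of articles in which a feature appears (the count discussed above) falls out of the same data as the document frequency.

```python
import math
from collections import defaultdict


def tf_idf(rows):
    """Compute a TF/IDF score for every (paper, feature) pair.

    rows: iterable of (bibcode, feature_id, count) tuples, one per
    paper/feature pair. This layout is an assumption for illustration.
    """
    rows = list(rows)
    total_per_paper = defaultdict(int)     # all feature mentions in a paper
    papers_per_feature = defaultdict(set)  # papers in which a feature appears
    for bibcode, feature_id, count in rows:
        total_per_paper[bibcode] += count
        papers_per_feature[feature_id].add(bibcode)

    n_papers = len(total_per_paper)
    scores = {}
    for bibcode, feature_id, count in rows:
        tf = count / total_per_paper[bibcode]
        idf = math.log(n_papers / len(papers_per_feature[feature_id]))
        scores[(bibcode, feature_id)] = tf * idf
    return scores


# Placeholder bibcodes and feature ids, not real data.
example_rows = [("paperA", 1, 3), ("paperA", 2, 1), ("paperB", 1, 2)]
example_scores = tf_idf(example_rows)
```

A feature that appears in every paper gets an IDF of zero with this plain formula, so a smoothed variant may be preferable for retrieval.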
Export database content: for the time being, we are going to have the run.py script output a CSV file that contains:
- Bibcode where the feature was found (e.g. 2000M&PS...35.1043T)
- Target (e.g. Moon)
- Feature type (e.g. Oceanus)
- Feature name (e.g. Oceanus Procellarum)
- Feature id (e.g. 4395): this is found in the source data from the gazetteer and is used to create the link to the online page for it (https://planetarynames.wr.usgs.gov/Feature/4395)
As specified in https://docs.google.com/document/d/11TvdloUDrbTXS7YbPg-8YKEzYUeiMv5Xu8esszt1jI8/edit?usp=sharing by Alberto
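A minimal sketch of what the export described above might look like, using only the standard library. The column names in `FIELDS` and the helpers `write_features_csv` and `feature_url` are hypothetical and are not taken from run.py; the example row reuses the values quoted above.

```python
import csv
import io

# Assumed column names; the actual header used by run.py may differ.
FIELDS = ["bibcode", "target", "feature_type", "feature_name", "feature_id"]


def write_features_csv(records, fileobj):
    """Write one row per (paper, feature) match to an open text file object."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


def feature_url(feature_id):
    """Build the gazetteer page link from a feature id."""
    return f"https://planetarynames.wr.usgs.gov/Feature/{feature_id}"


example_record = {
    "bibcode": "2000M&PS...35.1043T",
    "target": "Moon",
    "feature_type": "Oceanus",
    "feature_name": "Oceanus Procellarum",
    "feature_id": 4395,
}

buffer = io.StringIO()  # stands in for the real output file
write_features_csv([example_record], buffer)
csv_lines = buffer.getvalue().splitlines()
```

If a per-paper occurrence count is added later, it could go in as one more column without changing the rest of the layout.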