Coleridge-Initiative / RCPublications

Creative Commons Zero v1.0 Universal

feedback: USDA needs to add "ARMS" dataset #194

Open ceteri opened 4 years ago

ceteri commented 4 years ago

Updates from Julia -- good to reassign among others on the team:

USDA needs us to add the "ARMS" dataset for their review in mid-March:

We'll need to research which publications are linked. I've checked a few sources already, and the coverage looks reasonably rich. Here are several hundred candidates from an "Agricultural Resource Management Survey" full-text search:

ernestogimeno commented 4 years ago

@andrewhnorris already added the "ARMS" dataset to datasets.json.

As a starting point, I'm creating a datadrop using "USDA Agricultural Resource Management Survey" as the search term, which returns ~140 unique titles, and breaking it down into 5 chunks of work so they can be done in parallel.
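A minimal sketch of that split, assuming the search results are available as a plain list of title strings. The function name `split_into_chunks` is hypothetical and not part of the federated search script:

```python
from typing import List


def split_into_chunks(titles: List[str], n_chunks: int) -> List[List[str]]:
    """Deduplicate titles and split them into n_chunks near-equal partitions."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")

    # Drop blank entries and exact duplicates; sort for a stable ordering.
    unique = sorted({t.strip() for t in titles if t.strip()})

    # The first `extra` chunks each get one additional title.
    base, extra = divmod(len(unique), n_chunks)

    chunks: List[List[str]] = []
    start = 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(unique[start:start + size])
        start += size
    return chunks
```

With roughly 140 unique titles and 5 chunks, each partition would hold about 28 titles.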

ernestogimeno commented 4 years ago

@ceteri I realized that the federated search script does not include the journal in the datadrop, so that information is missing from the new partitions.

  1. Should I fix that in federated search?
  2. Is that an issue for the already generated partitions?

ceteri commented 4 years ago

No prob! Given either a doi or a good match on title, the journal is simple to retrieve later in the workflow.
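As a rough illustration of that later step, here is a sketch that looks up a journal from a list of reference records, trying an exact DOI match first and then a fuzzy title match. The record fields (`doi`, `title`, `journal`), the `find_journal` name, and the 0.9 threshold are all assumptions for the example, not the project's actual workflow code:

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(title.lower().split())


def find_journal(
    pub: Dict[str, str],
    reference: List[Dict[str, str]],
    min_ratio: float = 0.9,
) -> Optional[str]:
    """Return the journal for `pub`, matching on DOI first and then on title."""
    doi = pub.get("doi")
    if doi:
        for ref in reference:
            if ref.get("doi", "").lower() == doi.lower():
                return ref.get("journal")

    title = normalize(pub.get("title", ""))
    if not title:
        return None

    # Fall back to the closest title, accepted only above the threshold.
    best, best_ratio = None, 0.0
    for ref in reference:
        ratio = SequenceMatcher(None, title, normalize(ref.get("title", ""))).ratio()
        if ratio > best_ratio:
            best, best_ratio = ref, ratio

    if best is not None and best_ratio >= min_ratio:
        return best.get("journal")
    return None
```

In practice the reference records would come from whichever API returns the publication, so the missing journal in the existing partitions should not block the manual work.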

ernestogimeno commented 4 years ago

While trying to come up with new search terms for generating new datadrops, I visited the dataset website. They have a page, Uses and Publications, that references only ERS (internal) use of ARMS.

Is there any value in including those in our KG at this point?

ceteri commented 4 years ago

Good find! Are all of the publications from https://www.ers.usda.gov/data-products/arms-farm-financial-and-crop-production-practices/uses-and-publications/#pubs included?

We should make sure we have coverage on those.

ernestogimeno commented 4 years ago

I tried a few searches using our federated search (which only processes OpenAIRE, Dimensions, and PubMed at this time) and could not find any of those reports.

Should we include authors in the partition for these, in case they are not retrieved from any of the APIs we are using?

Do we also need to include those under the "Other Material Highlighting ARMS" subtitle?

ernestogimeno commented 4 years ago

@ceteri In a couple of publications I found mentions of a "predecessor" dataset:

Data on farms and farm operator households were obtained from USDA’s Agricultural Resource Management Survey (ARMS) and its predecessor, the Farms Costs and Returns Survey (FCRS). The FCRS first reported comparable data in 1989, so we used that year as our starting point.

(source: http://dx.doi.org/10.22004/ag.econ.34089)

I added Farms Costs and Returns Survey as an alternative title, but I'm not sure if that is the best way to handle this case. Should I add it as a different dataset?

ceteri commented 4 years ago

My sense is yes: datasets have entirely different instances over time, and those are typically referred to as distinctly different data. Clayton may have a different view, which is good to check -- but for now let's have a separate dataset for FCRS.
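To make the distinction concrete, here is a sketch of two separate entries and a lookup that resolves a mention to one of them. The field names (`id`, `provider`, `title`, `alt_title`) and the ID values are hypothetical and may not match the real datasets.json schema; the FCRS title is spelled as quoted in the publication above:

```python
import json
from typing import Dict, List, Optional

# Hypothetical entries: FCRS gets its own record instead of being
# listed as an alternative title of ARMS.
DATASETS: List[Dict] = [
    {
        "id": "dataset-arms",  # placeholder ID
        "provider": "USDA",
        "title": "Agricultural Resource Management Survey",
        "alt_title": ["ARMS"],
    },
    {
        "id": "dataset-fcrs",  # placeholder ID
        "provider": "USDA",
        "title": "Farms Costs and Returns Survey",
        "alt_title": ["FCRS"],
    },
]


def resolve_dataset(mention: str, datasets: List[Dict]) -> Optional[str]:
    """Return the ID of the dataset whose title or alt title equals the mention."""
    wanted = mention.strip().lower()
    for entry in datasets:
        names = [entry["title"]] + entry.get("alt_title", [])
        if wanted in (name.lower() for name in names):
            return entry["id"]
    return None


if __name__ == "__main__":
    print(json.dumps(DATASETS, indent=2))
```

Keeping the two records apart means a publication that cites only FCRS links to FCRS and not to ARMS.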

ceteri commented 4 years ago

Reopening to make sure we capture the points above.