GSA / datagov-wptheme

Data.gov WordPress Theme (obsolete)
https://www.data.gov

Showcase featured datasets quarterly #741

Open rebeccawilliams opened 8 years ago

rebeccawilliams commented 8 years ago

Data.gov should showcase featured datasets. GeoPlatform does this, LA does this, and it just makes sense.

This is how you could get it done:

  1. Every quarter, CFO-Act agencies should report featured datasets to Data.gov (as part of the CAP goal). These wouldn't have to be never-before-collected or never-before-structured data (though those would certainly be highlights), just any dataset that the agency has been generating press about. For example:
    • New data mashups in the news like: College Scorecard or the AFFH data
    • Routine releases featured in the news and press like: statistical data releases
  2. Data.gov should tag these datasets as featured and display them prominently, options include:
    • Above/below topics on Data.gov's homepage
    • Listed/starred above the other most viewed datasets on the data page
    • On a new page, etc

I think a spreadsheet shared between the OMB IDC and Data.gov, plus a script by Data.gov to tag these datasets, would be the quickest way to get this up and running. Other options to consider: automating the spreadsheet handoff, having agencies fill out a form on Data.gov, or having agencies tag these datasets every quarter in their data.json. Linking to the statistical agencies' data releases via their RSS feeds would be one way to automate part of this showcase, but hopefully those datasets are also included in each agency's data.json. If folks start coming to Data.gov for their routine statistical data needs, that will increase traffic and hopefully: 1. show regular statistical data users new data they did not know about, 2. help Data.gov discern what datasets are relevant to heavy users.
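To make the spreadsheet-plus-script idea concrete, here is a minimal sketch, not a description of how Data.gov actually works. It assumes the spreadsheet is exported as CSV with an `identifier` column, and that the catalog is an in-memory dict shaped like a data.json file (a top-level `dataset` list whose entries carry `identifier` and `keyword`). The tag name `featured`, the column names, and the sample identifiers are all made up for illustration.

```python
import csv
import io

FEATURED_TAG = "featured"  # hypothetical tag name, not an existing Data.gov convention


def load_featured_ids(csv_text):
    """Read dataset identifiers from the (assumed) 'identifier' column of the CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["identifier"].strip() for row in reader if row.get("identifier")}


def tag_featured(catalog, featured_ids, tag=FEATURED_TAG):
    """Add `tag` to the keyword list of every listed dataset; return the matched identifiers."""
    matched = []
    for dataset in catalog.get("dataset", []):
        if dataset.get("identifier") in featured_ids:
            keywords = dataset.setdefault("keyword", [])
            if tag not in keywords:  # avoid tagging the same dataset twice
                keywords.append(tag)
            matched.append(dataset["identifier"])
    return matched


# Illustrative data only: these identifiers are invented for the example.
SPREADSHEET = """agency,identifier,quarter
ED,ed-college-scorecard,2016-Q3
HUD,hud-affh-data,2016-Q3
"""

CATALOG = {
    "dataset": [
        {"identifier": "ed-college-scorecard", "title": "College Scorecard", "keyword": ["education"]},
        {"identifier": "hud-affh-data", "title": "AFFH Data", "keyword": ["housing", "featured"]},
        {"identifier": "example-other", "title": "Some other dataset", "keyword": []},
    ]
}

featured_ids = load_featured_ids(SPREADSHEET)
tagged = tag_featured(CATALOG, featured_ids)
print(tagged)
```

The same `tag_featured` step would also cover the "agencies tag it themselves in data.json" option: in that case the harvester would only need to look for the agreed keyword instead of consulting a spreadsheet.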

rebeccawilliams commented 8 years ago

I am curious whether folks in this community have other ideas about which datasets should be featured prominently on Data.gov, and about how to set up that automation. (Right now Data.gov harvests data.json files from each agency nightly.)

philipashlock commented 8 years ago

Agreed. For reference, here are a variety of related efforts for prioritizing or identifying "core" datasets:

A visual look at the selection and governance process for OKFN and NGDA:

[Image: screenshot, 2016-07-21]

[Image: fgdc]

dannguyen commented 8 years ago

This is in response to a prompt by @rebeccawilliams on NICAR-L:

If you're using federal government data for your reporting, do you go to Data.gov to find and retrieve that data? If you don't, why not and what would make you do so?

I'll use data.gov when a Google search takes me there. For example, it's surprisingly difficult to find certain forms of U.S. election results data. The only place that I could find (relatively updated) county-level data was on Data.gov, thanks to whoever at USGS is open-data-gung-ho: https://catalog.data.gov/dataset/2008-presidential-general-election-county-results-direct-download

The reason why I almost never start a data search at Data.gov is because of the limited list view, which only allows for filtering/sorting by Popular/Relevance. It takes a few clickthroughs to find out if a dataset is even worth looking at. I know the represented datasets come in many different forms and formats, but even a column/filter for size of the dataset in MB, or number of fields, when such metrics are calculable, would vastly improve the discoverability of datasets.
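As a rough illustration of the "size column when calculable" idea, the sketch below sorts a list of dataset records by total size and pushes records with no known size to the end. It assumes each record has a `distribution` list and that some distributions carry a `size_bytes` value; that field name is hypothetical and is not something the catalog is known to provide.

```python
def total_size_mb(dataset):
    """Sum the known distribution sizes in MB, or return None if none are known.

    Note: if only some distributions report a size, the result is a partial total.
    """
    sizes = [d.get("size_bytes") for d in dataset.get("distribution", [])]
    known = [s for s in sizes if isinstance(s, (int, float))]
    if not known:
        return None
    return sum(known) / 1_000_000


def sort_by_size(datasets, descending=True):
    """Order datasets by size; datasets with no calculable size always go last."""
    sized = [(total_size_mb(d), d) for d in datasets]
    known = [(size, d) for size, d in sized if size is not None]
    unknown = [d for size, d in sized if size is None]
    known.sort(key=lambda pair: pair[0], reverse=descending)
    return [d for _, d in known] + unknown


# Illustrative records only.
datasets = [
    {"title": "A", "distribution": [{"size_bytes": 2_000_000}]},
    {"title": "B", "distribution": [{"format": "CSV"}]},  # size not calculable
    {"title": "C", "distribution": [{"size_bytes": 500_000}, {"size_bytes": 4_500_000}]},
]

ordered_titles = [d["title"] for d in sort_by_size(datasets)]
print(ordered_titles)
```

Keeping "unknown" separate from "zero" matters here, since many catalog entries point at APIs or landing pages where a file size simply does not exist.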

That neither data.gov.uk nor Socrata incorporates this as a data field when listing datasets suggests that it's not a trivial fix. However, I do like how data.gov.uk at least provides file size information when you visit the landing page for a given dataset:

https://data.gov.uk/dataset/monthly-land-registry-property-transaction-data

[Image: screenshot of a data.gov.uk dataset landing page]

This is something that should be fairly easy for Socrata to do since so much of their data fits the single-table-CSV model, and because their API exposes both row and column count for any given dataset. However, I do like that they show (and can sort by) total number of views and last update time for datasets:

[Image: screenshot of a Socrata dataset listing]

samanthasunne commented 8 years ago

I love the showcase idea. I'd also like to see data.gov organize search results in a way that would be useful for the searcher.

Oftentimes, I search something like "student loans" and I expect to see obvious data sets like average student loan amounts first. Instead, it's usually an unordered list of anything that has anything to do with student loans, including data that's out of date or from a small locality like a town in Texas. You can refine your search and use the filter buttons from there, of course, but to me it seems like a site with the name data.gov is trying to be like a Google - a simple one-stop search box.

I also think rewriting the names and descriptions to something more user-friendly would be helpful. This may be too much of a workload, since it looks like the names/descriptions are computer-generated. But certainly an average user - like someone who's not necessarily data-savvy - would be more likely to click on something called "Default rates on student loans" than "Federal Family Education Loan/Direct Loan Cohort Default Rates, 2011."

rebeccawilliams commented 8 years ago

Glad folks are engaged in this thread! I thought NICAR might be specifically interested in the showcase idea because it ties press/PR to open data. I think search and file size previews should probably be opened as new issues, though, @samanthasunne & @dannguyen. Also, Dan, FWIW: Data.gov is CKAN and WordPress, and it hosts metadata, not files, which makes it different from the examples you shared and more challenging.

@philipashlock I think showcasing coordinated federal programs or outside data roundups or coming up with the Core Data to showcase would be useful too, but to be really clear here: my request is that anytime an agency (or 18F, EOP, etc) is sharing a new data story with the press, that this get in the showcase hopper. The showcase hopper might even be separated from the data.json files, but there should be a showcase hopper!

ghost commented 8 years ago

If the featured datasets could be "featured" by topics, they could fit under the Data.gov topics section.
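A small sketch of what grouping by topic could look like, assuming featured datasets are marked with a `featured` keyword (a hypothetical convention) and that topics come from each record's `theme` list as in a data.json entry. The "Uncategorized" bucket is also just an assumption for records with no theme.

```python
from collections import defaultdict


def group_featured_by_topic(datasets, tag="featured"):
    """Map each topic to the titles of featured datasets filed under it."""
    groups = defaultdict(list)
    for dataset in datasets:
        if tag not in dataset.get("keyword", []):
            continue  # skip datasets that are not marked as featured
        for topic in dataset.get("theme") or ["Uncategorized"]:
            groups[topic].append(dataset["title"])
    return dict(groups)


# Illustrative records only.
sample = [
    {"title": "College Scorecard", "keyword": ["featured"], "theme": ["Education"]},
    {"title": "AFFH Data", "keyword": ["featured", "housing"]},
    {"title": "Not featured", "keyword": ["education"], "theme": ["Education"]},
]

by_topic = group_featured_by_topic(sample)
print(by_topic)
```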