Open EvanBianco opened 6 years ago
I like the idea of making it easier to understand what data types are included, but I suspect you'll very quickly end up with so many characteristics to include that a table would make for faster visual inspection than icons, especially if the number of examples gets large.
Maybe a good way to go about this is to start a Google Sheet with all the characteristics you might want to know before deciding if a dataset is useful for a potential project? For example, for a well-based project I'm usually interested in not just the data format of the logs but also the number of wells, the log types, which log types are present in all wells, presence of tops, presence of lithology data, how far apart the wells are, and whether they're from the same geologic formation or not. In another project, you might be really interested in sonic and whether wells and seismic are from the same area. I think one thing open datasets in geoscience are really missing is a format or list of characteristics to include in the summary that helps people understand if the dataset will work for them. Currently you almost always have to spend hours or days digging around to know if a dataset will work for you. Geologists are not used to searching for datasets that meet their analytical needs, but rather to working with whatever data is available in a geographic area. Their dataset summaries reflect this.
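To make the idea of a characteristics list a bit more concrete, here is a rough sketch of one row of such a sheet written as a Python record. Everything in it is an assumption for illustration: the class name `WellDatasetSummary`, the field names, the `matches` helper, and the example values are hypothetical and do not describe any real dataset or agreed format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WellDatasetSummary:
    """Hypothetical summary record for a well-based open dataset."""
    name: str
    log_format: str                      # e.g. "LAS"
    n_wells: int
    log_types: List[str] = field(default_factory=list)               # present in at least one well
    log_types_in_all_wells: List[str] = field(default_factory=list)  # present in every well
    has_tops: bool = False
    has_lithology: bool = False
    max_well_spacing_km: Optional[float] = None   # None means "not documented"
    same_formation: Optional[bool] = None         # None means "not documented"

    def matches(self, required_logs, need_tops=False):
        """Return True if every required log is present in all wells
        and tops are available when they are needed."""
        logs_ok = set(required_logs) <= set(self.log_types_in_all_wells)
        tops_ok = self.has_tops or not need_tops
        return logs_ok and tops_ok


# Made-up example values, only to show how the record would be queried.
summary = WellDatasetSummary(
    name="example-wells",
    log_format="LAS",
    n_wells=10,
    log_types=["GR", "DT", "RHOB"],
    log_types_in_all_wells=["GR"],
    has_tops=True,
)
print(summary.matches(["GR"], need_tops=True))
print(summary.matches(["GR", "DT"]))
```

With a record like this per dataset, the "will it work for my project" question becomes a quick filter rather than hours of digging, though the right set of fields would need to be agreed on first.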
I think this would probably not scale well on a static list.
We could instead have a nice web app or a separate list with a proper "legend" specific to open datasets. Awesome open geodataset?
I'd like to suggest some small set of icons that we could use to indicate what kinds of data files are contained within the various open data collections.
This would require a little bit of graphic design work. Any takers?
Icons might include:
Standard formats:
Non-standard formats (this will be harder to contain)
etc.
Perhaps such a feature could be used to introduce a standardized summary page for each dataset, consistently documented and curated. We could even write a script to build such a summary directly from the dataset itself, as a first-order entrance exam to test the quality of the dataset.
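As a minimal sketch of what a first pass at such a script might look like, the snippet below walks a local copy of a dataset and counts files by extension, then prints the counts as a markdown table. It assumes the dataset has already been downloaded to a folder; the function names `summarize_extensions` and `format_summary` are made up for this example, and a file extension is only a rough proxy for the actual data format.

```python
from collections import Counter
from pathlib import Path


def summarize_extensions(root):
    """Count files under `root` (recursively) by lower-cased extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without a suffix are grouped under a placeholder label.
            counts[path.suffix.lower() or "(no extension)"] += 1
    return dict(counts)


def format_summary(counts):
    """Render the extension counts as a small markdown table."""
    lines = ["| extension | files |", "| --- | --- |"]
    for ext, n in sorted(counts.items()):
        lines.append(f"| {ext} | {n} |")
    return "\n".join(lines)
```

Usage would be something like `print(format_summary(summarize_extensions("path/to/dataset")))`, where the path is a placeholder. A real entrance exam would need to go further and actually open the files to check that they parse.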