ekansa-pubs / ekansa-pubs.github.io

Publication drafts for public review

Comment on 'click here' #25

Closed benmarwick closed 8 years ago

benmarwick commented 8 years ago

Lots of interesting stuff in here, I found your discussion of 'openness' and cultural heritage especially interesting. The CyArk story was an eye-opener!

I question the claim that commercial players dominate much of our interaction with the World Wide Web. While I agree that much of the aesthetics of the internet is derived from commercial sources, I wonder how many digital archaeology projects have a workflow dominated by open source software. I mean open source software that is not made especially for digital archaeologists, but which they appropriate and contribute back to with bug fixes, etc. For example, many popular CMS, DBMS, servers, and programming languages are open source, and used widely in industry and academia. This suggests to me that the spectrum for understanding digital archaeology projects is not a line with commercial at one end and academia at the other, but a triangle with a third corner which is the economics and culture of open source software. I don't think digital archaeology should become like the open source movement, only that there are a few relevant parallels.

For example, there is an expectation that the open-source movement will continue to produce generally high quality and regularly updated software for a variety of recreational, scientific, and industrial purposes for free. Firefox is probably the most obvious example. I suspect for many people, digital archaeology projects might get classified in a similar way. Once the main effort of producing the product is complete (for which some money seems justified), I think there is an expectation that a group of volunteers will maintain the project, essentially for free, because that's how it seems to work with open source software. People do it for the love of it, or because their employer supports them to work on the open source software (because it relates to a commercial project; I wonder what percentage of open source software is produced in this way, probably more than I imagine). I guess one challenge with archaeology is that there are very few employers that support work on digital projects!

I'm not sure it's totally fair to claim that unlike JSTOR, Digital Antiquity has largely adopted “open data” policies. JSTOR are unique among giant scholarly content aggregators with their 'Data for Research' service that allows a researcher to download massive amounts of journal articles in formats that are convenient for text mining (CSV or XML). To me, that's a huge bit of openness that no other major journal publisher offers with their data. I'm not completely sure who this DFR service is available to (everyone or just JSTOR subscribers), but that kind of openness with the text data is pretty remarkable by itself, and worth a mention.

One possible cause of the slow take-up of open linked data in archaeology might be the great variety of data collection methods, which in turn might be due to the lack of a Lakatosian hard core in many areas of archaeology. The wonderful PLOS paper on zooarchaeology in Turkey is a good example, I think, because zooarchaeology has a hard core of biology, and basic measurements of animal bones in archaeological sites draw on biological principles, so it's relatively easy to have comparable datasets across different projects and analysts. For other areas, such as stone artefacts and pottery, I'm not sure that there is a hard core, beyond compositional analysis and basic regionally-specific typological systems.

Failing the appearance of a hard core in many areas of archaeology, the other way I see open linked data becoming more widely adopted in archaeology is if it were normalized by becoming part of the narrow neck of the traditional research process: get a grant -> collect some data -> analyse it -> publish it in a certain type of journal -> publish it in an open linked repository. This requires at least two things to happen: a broad recognition of the value of open linked data (in some fundamental way, such as a time-saver/efficiency improver/productivity booster/visibility booster/ethical necessity), and a generational change so that the people in charge of the narrow neck are the ones who recognize the value of open linked data. The publication of the Homo naledi finds in eLife and 3D models on MorphoSource, and the massive publicity surrounding all of that, is one of the recent gradual changing-of-the-guard type moments, where one or two people (probably John Hawks, mostly) decide to do things differently and in a very prominent way. The watershed moment would be something like American Antiquity requiring open linked data for every journal article. But that's a long way off; like most journals, AA are barely interested in anything beyond the text of the manuscript. We might say they embody Derrida's famous "il n'y a pas de hors-texte" ("there is nothing outside the text").

There is a slight note of tDAR vs Open Context, with OC as the underdog, and a warning about the downside of tDAR turning into the JSTOR of archaeological data, that is, the monopolizer that leaves no room for competition. Among common-or-garden variety archaeologists, I don't think there is much awareness about the difference between tDAR and OC, and they are probably gravitating to the service provider that fulfills their obligations with the least effort, and that is being used by prominent senior archaeologists (i.e., Boyd and Richerson's content bias and model-based bias). We might say that OC is ahead of its time, and archaeologists are not ready for such a sophisticated treatment of their data.

I am guilty of depositing messy and undocumented data into repositories. My motivation is improving the reproducibility of my work, rather than ensuring its enduring value (though I should pay more attention to that too). But while most others are depositing nothing at all, I already wonder if it's worth my time to do even as little as that, while my peers are dashing off another paper for a high-prestige journal, without making any data available in any form. Libraries and museums do seem like a natural home for digital data, and it's easy to make a connection between depositing specimens in a museum, and depositing data there. I agree that these relationships might lead to a bright future for raising the profile of archaeological data as a research product. The parallel I made earlier with the open source movement might be relevant here; for example, a museum might provide free or low cost data hosting to researchers in a similar way that Canonical, SUSE, HP and VMware contribute staff time and money to open source projects. Or perhaps a better fit to the analogy is a publisher as the data repository, since they actually make a profit, unlike most museums.

ekansa commented 8 years ago

Dear Ben,

Sorry for my delayed response, I've been busy deploying the new version of Open Context. I'm going to digest and respond to your comments after I get done with this deployment.

Again, thank you so much for putting in the time for a careful read and for insightful comments! I'm very grateful! -Eric

benmarwick commented 8 years ago

No worries, I hope the deployment goes well!

ekansa commented 8 years ago

Hi Ben,

Again, thanks for the comments! I'm finalizing the paper now, but offline (away from GitHub) because of formatting requirements for submission. I've fixed the JSTOR "bug", and several other issues.

On the note about being "ahead of our time", I'm not so sure. We're really, really busy, and feel very much in demand. The problem centers on financing to support enough people to meet that demand, since we're explicitly taking a slower and more hands-on curatorial approach that resists scaling to big volume. The trick is to make sure we devote our efforts to areas where Open Context's model makes the most sense, and not in areas where tDAR's model makes more sense. I think that's in everyone's interest, but requires experience and experimentation to fully understand.

Best! -Eric