danielskatz / software-vs-data

understanding and documenting the differences between software and data in the context of citation
Creative Commons Attribution 4.0 International

Granularity #40

Closed zuphilip closed 7 years ago

zuphilip commented 7 years ago

I would suggest adding another difference regarding "granularity". For example, nation-wide survey data covers many different categories each year, which are themselves spread over many tables. Each year, data providers can publish a) the new data once, b) the new data spread across different collections, or c) a new data dump of everything (past + present).

On the other hand, I might want to cite i) the data as a whole, ii) just one specific table of the data, or iii) just one value from the data.

I don't think the same "granularity" exists in software citations, although there might also be different versions...

Let me know what you think about this one. I can search for some examples and references.

danielskatz commented 7 years ago

I think that this granularity does have a parallel in software. A code repository might also have text documents or even sample data inside.

In some sense, a data collection is different than just data, and a software repository is different than just source code. Both are containers that include multiple types of objects.

zuphilip commented 7 years ago

Hm... okay, I see. One could say that both data and software can have a hierarchical structure of smaller parts, which for software could mean included libraries, folders, or even separate files. I guess that this is then not a difference but something they can both have.

For completeness:

danielskatz commented 7 years ago

As something that they can both have, it's interesting but not quite appropriate for this document, so I will close this issue. Thanks for bringing it up.