kbroman / dataorg

Tutorial on organizing data in spreadsheets.
http://kbroman.org/dataorg

Delineate use cases clearly #4

Closed iangow closed 8 years ago

iangow commented 8 years ago

Off the top of my head, I can think of four scenarios:

  1. I am doing data entry myself.
  2. I am not doing data entry myself, but I have control over the process.
  3. I am not doing data entry myself, but I have influence over the process.
  4. I am not doing data entry myself and I have no influence over the process.

As I read the advice on this site, I feel that sometimes it applies better to one use case than the others.

For example, in Case 4, the advice should be "do as much as possible in scripts". If you're merely a recipient of (say) an Excel spreadsheet, then opening the Excel spreadsheet to reformat dates, etc., seems like a bad idea. But if you're in Case 2 (and you're compelled to use Excel), then reformatting dates as yyyy-mm-dd might be a good thing to do in Excel.
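A minimal sketch of the Case 4 approach, assuming the received spreadsheet has already been exported to CSV and that its dates arrive as either yyyy-mm-dd or US-style m/d/yyyy. The column name `visit_date`, the sample rows, and the helper names are hypothetical; the point is only that the original file stays untouched and the reformatting lives in a script.

```python
import csv
import io
from datetime import datetime

# Assumed input formats (hypothetical); adjust to what the received file uses.
# Note that "%m/%d/%Y" reads 3/4/2016 as March 4, not 3 April.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def to_iso_date(text):
    """Return text as yyyy-mm-dd, or None if no known format matches."""
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_dates(rows, column):
    """Yield copies of rows with `column` rewritten as yyyy-mm-dd."""
    for row in rows:
        iso = to_iso_date(row[column])
        if iso is None:
            # Fail loudly rather than silently passing bad dates through.
            raise ValueError(f"unrecognized date: {row[column]!r}")
        fixed = dict(row)
        fixed[column] = iso
        yield fixed


# Stand-in for the received file; in practice this would be open(path, newline="").
raw = "id,visit_date\n1,3/14/2016\n2,2016-03-15\n"
rows = list(normalize_dates(csv.DictReader(io.StringIO(raw)), "visit_date"))
print([r["visit_date"] for r in rows])
```

Because the script raises on any date it cannot parse, unexpected formats surface immediately instead of being quietly mangled, which is harder to guarantee when editing cells by hand.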

In Case 3, the advice might be to encourage whoever is entering the data to follow guidance here.

Finally, Cases 2 and 3 introduce issues that don't exist in Case 1. Sometimes the advice seems more applicable to Case 1 (e.g., on version control).

kbroman commented 8 years ago

I'm thinking mostly of 1.

iangow commented 8 years ago

I think that's right. You do say "I don't have much experience with data entry," so I guess you have views on the other use cases (these are probably the ones that excite me more). But perhaps advice on these belongs somewhere else. (My thoughts on Google Sheets in #5 perhaps pertain more to those; pretty much everything carries over from Excel to Google Sheets whether you are entering data yourself or entering it for someone else.)

Mentions of the cases occur here and there, but the thrust is "if you're entering data, here's how to do it."