datacarpentry / spreadsheets-socialsci

Data Organization in Spreadsheets for Social Scientists
http://datacarpentry.github.io/spreadsheets-socialsci/

Update lesson to work with google sheets #97

Open colinquirk opened 4 years ago

colinquirk commented 4 years ago

It's not clear to me whether this lesson would work with Google Sheets out of the box. But since getting Google Sheets running on a personal machine costs very little, I think it would be a nice alternative to LibreOffice, which has known issues reported in the lesson (e.g. different delimiters).

angela-li commented 4 years ago

Yes, agreed that Google Sheets may be a better alternative to LibreOffice. For data validation, Google Sheets also has more functionality.

chris-prener commented 4 years ago

I like this suggestion @colinquirk, and thanks for your patience! I agree with @angela-li that it would perhaps be preferable to LibreOffice.

However, I'm not in a place to do the makeover myself. Is this something you would like to contribute to, @colinquirk ?

colinquirk commented 4 years ago

Sorry, I don't think I'll be able to take this on. Maybe a new instructor will be interested in taking a shot.

chris-prener commented 4 years ago

No worries @colinquirk !

kerchner commented 4 years ago

Has anyone yet tried running through the material to at least determine at which points Google Sheets users would have to do something different?

chris-prener commented 4 years ago

Not that I am aware of, @kerchner!

dolsysmith commented 4 years ago

Having taught this lesson last week with Excel, I just ran through the demonstrations and exercises using Google Sheets. With one exception -- in the Data Validation lesson -- the functionality in Sheets tracks closely to Excel for this content. See my summary below:

In summary, I think Sheets would probably be a viable option for those without access to Excel. (And although this topic doesn't come up in the Lessons, I have found in my own work that Sheets can be easier to work with than Excel when importing CSV data from other sources. Excel, for instance, tends to make assumptions about certain data types -- e.g., dates and long integers -- that can prove problematic if you don't correct for them on input. Sheets's type system seems a bit more flexible in this respect.)
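To illustrate the type-guessing concern outside any particular spreadsheet program, here is a minimal Python sketch. The column names and values are hypothetical, and the numeric coercion only approximates what a spreadsheet's automatic typing might do; it is not a description of how Excel or Sheets work internally. It shows that reading every cell as text preserves long identifiers and leading zeros, while coercing to numbers can silently alter them.

```python
import csv
import io

# Hypothetical CSV export: an 18-digit ID, a code with a leading zero, a date.
RAW = (
    "respondent_id,zip_code,interview_date\n"
    "123456789012345678,02139,2016-11-17\n"
)


def read_as_text(handle):
    """Read every cell as a plain string, with no type guessing."""
    return list(csv.DictReader(handle))


row = read_as_text(io.StringIO(RAW))[0]

# Kept as text, the values survive exactly as written in the file.
id_as_text = row["respondent_id"]
zip_as_text = row["zip_code"]

# Naive numeric coercion (an assumption standing in for automatic typing):
# a 64-bit float cannot represent every 18-digit integer exactly,
# and converting a code to an integer drops its leading zero.
id_as_float = int(float(row["respondent_id"]))
zip_as_int = str(int(row["zip_code"]))

print("ID preserved as text:", id_as_text == "123456789012345678")
print("ID preserved after float coercion:", id_as_float == 123456789012345678)
print("Leading zero preserved after int coercion:", zip_as_int == zip_as_text)
```

The same idea applies when importing into a spreadsheet: where the import dialog allows it, marking identifier and code columns as text avoids this kind of silent change.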

(This comment submitted as part of the Carpentries Instructor Training Checkout process.)

chris-prener commented 4 years ago

thanks so much @dolsysmith - this is wonderful feedback! It looks like moving forward on a Google Sheets version of this is a viable option.

bkmgit commented 4 years ago

I made a pull request for a version using LibreOffice (#121). LibreOffice is also available as a cloud version through Collabora Office. ONLYOFFICE is also available in the cloud, but the data validation section would need to be changed for it. WPS Office has an online version that is currently free to use and would require only minimal additions to the validation section.

lvermeyden commented 3 years ago

If help is still needed to update the lesson to work with Google Sheets, I am happy to assist with that.

ndporter commented 1 year ago

Copying feedback received on Slack from @froggleston on data privacy and cloud services here in case people want to work on this issue:

I think this is a good point to raise, as human-related or identifiable data shouldn't be stored or processed in formats that cloud providers can process. For example, Google Docs are processed for content by Google and indexed to enable the efficient search the service provides. Users should be made aware that cloud providers can, and regularly do, scan content on their systems for malware, viruses, etc. I would at least recommend adding a note to the lessons that human-related data should not be stored or analysed in open cloud formats like Google Docs or Sheets.

Looking at organisations that have very well-documented data policies relating to data services, e.g. the University of Michigan (https://safecomputing.umich.edu/dataguide/), Personally Identifiable Information is permitted within Google suite services (Sheets, Docs, etc.), but Protected Health Information and Sensitive Identifiable Human Subject data are not. It might be helpful to remind staff and students that they need to comply with their own institutions' data policies.

kerchner commented 1 year ago

Thinking about the learning objectives of this workshop, they transcend any particular spreadsheet program, so if someone later has data that requires a more secure environment, they will easily be able to apply what they learned in this lesson to a different program. Google Sheets, equally accessible to everyone, removes the software setup barrier for Carpentries lesson participants, and could result in all participants being in the exact same program. In my opinion that pedagogical benefit far outweighs any future concerns about working with sensitive data outside of the workshop - where, again, the lessons learned from the workshop could be applied in a different spreadsheet program.