I just posted this as a discussion post, but here it is below as well:
Based on my experience (in this class so far; I don't work in this field) and the reading, one situation in Earth Data Science where a lack of context could be misleading is working with crowdsourced data. To my knowledge, crowdsourced data doesn't actually say who collected it (which may be related to data privacy). As a possible scenario, if most users who upload data to iNaturalist are men, or are mostly white and educated, that is part of the story that goes untold, and it may influence analysis of the data.
The main thing I can do as a data scientist to provide more context for my analysis is to never take the numbers in a dataset at face value, because of the imbalances of power in the data setting. I need to interrogate the "context, limitations, and validity of the data" I am using and ask myself what is missing. I need to ensure that the data's "situatedness" is taken into account, since my analysis builds on whatever dataset I use. It's like needing a foundation to build a house: without sound structural support to build on, the house would fall. I think most things in life fail in one way or another without proper context and without being situated. One struggle I might face in providing more context for my analysis is a lack of context, or not enough context, in the dataset I am using. A lack of time or resources to provide context could also be a struggle. However, I couldn't in good conscience publish an analysis of something without context; that seems unethical to me. Any struggles or issues I face during analysis need to be called out in the limitations of the analysis.
Some things I can do as a data provider to help scientists keep data in context are:
When I am gathering data or creating a dataset, I need to keep in mind that I am cooking data, and call myself out on what is missing from my dataset (what was left out) and what that tells someone about the dataset. Explicitly say how the dataset was intended to be used (or how it should not be used), with clear variable/table/header names and even a dictionary of definitions for certain words. Incorporate culture, context, and nuance into gathering data and creating datasets. Encourage data users to interrogate a dataset I create, but also provide my own interrogation as a starting point: "ask questions about the social, cultural, historical, institutional, and material conditions under which that knowledge was produced, as well as about the identities of the people who created it." There's nothing I love more than being self-critical. Limitations are key, and so is thoroughly explaining what they are and any associated ethical obligations.
Aside from answering these questions, I wanted to say that I LOVED this reading (I bought the book immediately). I went to a women's college, one of the Seven Sisters, and feminist theory is entrenched in my being and in everything I do. If I could, I would take a whole class or certificate in Data Feminism or Data Intersectionality, because it's empowering and awesome.
I think the context problem in data science speaks to how heavily this field is dominated by men, and has been since its creation. Because of this, I am not shocked by the 'issue of context'.
This chapter made me think of some great authors: Kimberlé Crenshaw, who coined the term 'intersectionality' and has great, very moving TED talks such as The Urgency of Intersectionality, as well as Chimamanda Ngozi Adichie, with The Danger of a Single Story and We Should All Be Feminists.
Read and respond to the discussion for the next two weeks: https://github.com/earthlab-education/Earth-Analytics-AY24/discussions/462