Thanks, @scoker-me! Please add the front matter to the post. Use this as an example: https://github.com/MERLTech/MERL-Center-public/blob/main/sample-frontmatter.md
Merging this PR, but we might need to make changes later.
layout: blog-post
title: insert
authors:
images should be in the /assets/img/posts/ folder
featuredImage: https://user-images.githubusercontent.com/55405623/132376892-0271ee8c-8fff-4f52-8cda-9b7e27866dd2.png
outgoing: false
outgoingUrl:
PROS AND CONS OF OPEN DATA
Melissa Edmiston, Stephanie Coker, Stephanie Jamila, and Thembelihle Tshabalala
Open data is an increasingly important topic in MERL. Many MERL practitioners advocate for open data because sharing data lets others analyze and reanalyze it and draw new and beneficial conclusions. However, making data open carries risks and could result in unintended consequences. The following guide outlines some of the pros and cons of open data and things to consider when making your data open. For a summary of the key points highlighted in the article, see Table 1.
Table 1: Summary of key points
PROS
Accessibility of data
One of the overarching benefits of open data is accessibility within a thematic area or sector. Data collection and cleaning can be expensive, and many projects or organizations are limited in their capacity. Making data open increases the number of datasets that others can analyze and draw conclusions from. This can result in:
Increased community engagement:
Open data has the potential to build a community around the data, bringing together people who work on similar issues so they can exchange ideas and findings and discuss challenges. This can encourage collaboration on data rather than competition. Both users and creators of open data have formed communities around a dataset or topic of interest, and these two groups are not mutually exclusive.
Improved efficiencies and reduced costs:
Access to open data increases the rate and ease of discovery, giving researchers more resources to strengthen their work across disciplines. Open data can be used to enhance data that is already at the disposal of organisations and companies of all sizes. Small companies in particular can benefit from open data about an industry into which they would like to expand. Open data can also reduce duplication in data collection efforts, saving organizations time and money.
Progress and innovation:
Because open data is offered without a monetary barrier, more people have access and can use new methods of analysis, which can further the field of study or contribute to programmatic advancements, encouraging innovation and progress.
Increased transparency
Open data can also lead to increased transparency for users around topics or issues that the data addresses. Since open data is freely and publicly available, it lowers the barrier for the general public (and specific stakeholders) to understand the topic or issue the data addresses. Having the data at hand also empowers stakeholders to act on the data, advocating for themselves and their community.
Reduced corruption
Open data is an important element in the fight against corruption. It strengthens public integrity and accountability between policymakers, government, companies, and citizens by making evidence of maladministration, governance gaps, or blatant corruption openly available. While a significant amount of important and useful government data remains inaccessible, there are examples of governments taking stances to support open data initiatives.
Interpretation of data
Open data allows more people to analyze the data and to interpret and validate the findings in numerous ways. A McKinsey report on the benefits of open data stated that open data has three value levers: decision making, innovation, and accountability. It also highlighted that these value levers benefit a wide range of stakeholders, and that a single open-data initiative can empower governments, the private sector, and NGOs alike, although each derives different value depending on how it uses and interprets the data.
CONS/RISK FACTORS
Incorrect use of data and missing data
When using open data, proper consideration of data collection methods and metadata is paramount for accuracy. When these are misunderstood, erroneous conclusions may be drawn from data.
Privacy and Consent
Data, whether open or proprietary, is regulated by laws that aim to guard the rights of individuals and protect against malicious use of data. The passage of the EU’s General Data Protection Regulation (GDPR) marked a landmark in enforceable data privacy legislation. It has been touted as the most significant regulatory development in information policy and has influenced the development of data privacy policy in other territories.
Mosaic Effect
The mosaic effect is a term used when discussing confidentiality. It is derived from the mosaic theory of intelligence gathering, in which disparate pieces of information become significant when combined with other types of information. Applied to data in the MERL sector, this occurs when multiple datasets are linked to reveal new information. Even if data is appropriately anonymized and efforts are made to remove personal identifiers, when multiple datasets contain similar or complementary information it is possible to determine a person’s identity by combining attributes across the datasets, such as gender, location, and educational status. Resources are now available to help MERL practitioners think about how their data may contain linkages or risks that require additional levels of security or anonymization. Figure 1 displays an example of how identity theft can occur when the mosaic effect takes place.
Figure 1: Mosaic Effect Example of Identity Theft https://user-images.githubusercontent.com/55405623/132376892-0271ee8c-8fff-4f52-8cda-9b7e27866dd2.png
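To make the linkage risk concrete, here is a minimal Python sketch of the mosaic effect. Everything in it is an assumption made for illustration: the two datasets, the field names, the `QUASI_IDENTIFIERS` tuple, and the `link_records` helper are all hypothetical and the records are invented. It shows how an "anonymized" survey can be re-identified when its gender, location, and education fields match exactly one person in a separate dataset that contains names.

```python
from collections import defaultdict

# Invented example records. The survey has no names, but it keeps
# attributes that also appear in other datasets (quasi-identifiers).
survey = [
    {"gender": "F", "location": "District A", "education": "Tertiary", "income_band": "low"},
    {"gender": "M", "location": "District A", "education": "Secondary", "income_band": "middle"},
    {"gender": "M", "location": "District B", "education": "Secondary", "income_band": "high"},
]

# A second, separately published dataset that does include names.
registry = [
    {"name": "Person 1", "gender": "F", "location": "District A", "education": "Tertiary"},
    {"name": "Person 2", "gender": "M", "location": "District A", "education": "Secondary"},
    {"name": "Person 3", "gender": "M", "location": "District A", "education": "Secondary"},
    {"name": "Person 4", "gender": "M", "location": "District B", "education": "Secondary"},
]

QUASI_IDENTIFIERS = ("gender", "location", "education")


def quasi_key(record):
    """Combine the quasi-identifier values of a record into one lookup key."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link_records(anonymized, named):
    """Return (name, anonymized_record) pairs for every anonymized record
    whose quasi-identifiers match exactly one named record."""
    names_by_key = defaultdict(list)
    for record in named:
        names_by_key[quasi_key(record)].append(record["name"])

    matches = []
    for record in anonymized:
        candidates = names_by_key.get(quasi_key(record), [])
        if len(candidates) == 1:  # a unique match re-identifies the person
            matches.append((candidates[0], record))
    return matches


reidentified = link_records(survey, registry)
for name, record in reidentified:
    print(name, "->", record["income_band"])
```

In this toy data, two of the three survey rows match a single named person, while the third is shared by two people and stays ambiguous. A similar check run before publication, counting how many records share each combination of quasi-identifiers, is one way to spot rows that may need further aggregation or suppression.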
Costs and sustainability of open data projects
Open data has been described as a public good. While the data is offered for free, there is usually a substantial cost for the organization implementing the open data initiative. According to recent literature, start-up costs of open data initiatives vary from €20,000 to €100,000 per organisation. Start-up costs are followed by adaptation costs, infrastructural costs, and maintenance/operational costs. Additionally, from an NGO/non-profit perspective, funding these open data projects relies on being able to pitch the usefulness of open data to funders. There is a risk of funders’ priorities changing, which can harm the long-term sustainability of the open data project. Another risk is that if funders’ and users’ agendas don’t align, the open data project may end up not serving the needs of the people who actually use the data. All of these sustainability factors affect decision-making around open data initiatives and often prove insurmountable.