okffi / datakoulu

Help with using open data

Open Data as a part of Information Production Process #7

Open lainesam opened 10 years ago

lainesam commented 10 years ago

In my view, open data teaching should definitely include at least one module on the holistic management of information production. This could be tied, for example, to the Total Data Quality Management framework and, through it, to practical applications. In addition to the data itself, the module would also cover at a general level: a) what is involved in the creation of data before it reaches the databases; b) what is involved in transforming data as data flows move from one database to another, from one organization to another, from one project to another, and so on; c) what is involved in using data for different purposes and in the validity of the conclusions drawn from it.

In addition to application development alone, the module would also cover at a general level: a) what the information production process is as a communal activity; b) what quality control is; c) what data standardization is; d) how both technical and human quality control, as well as data standardization/documentation, should always be built into the information production process.

A few links that I would present myself in one way or another: http://qualidat.aalto.fi/presentations/DRAFT_QUALIDAT_Open_Data_for_Open_Healthcare.pptx http://qualidat.aalto.fi/presentations/DRAFT_QUALIDAT_Description_of_Situated_Semantics.pptx http://qualidat.aalto.fi/presentations/ESITYS_tiedon_tuottajien_roolit_ja_kirjaamistilanteet.pptx

apoikola commented 10 years ago

Good idea, do you @lainesam know some introductory level reference materials to these? Short readings / video lectures etc.

lainesam commented 10 years ago

I added links to a couple of my own draft presentations on the topic. I have my own presentations on this subject area from Systeemityöyhdistys, Kokonaisarkkitehtuuriosaamisyhteisö and academic conferences alike.

If needed, I can fairly easily put together both practice-oriented and academic material, as well as example cases both from around the world and from my own research.

apoikola commented 10 years ago

Thanks, "käytännönläheinen" (practice-oriented) is what we are looking for. We will see how this fits into the course outline and contact you later on.

lainesam commented 10 years ago

Yes, definitely practice-oriented but also theory-informed. :-)

The issue would be how to summarize those slides and show, through live examples, what these issues mean in practice, and also how to solve them in practice with little effort. For example, if you open data that is produced by someone and will be used by others, how should you document, deliver and control the meaning and quality of the data? It's up to the others to do the same for their part of the complete information production process.
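One low-effort way to "document, deliver and control" could be to publish a small data dictionary together with the data file and check every delivery against it. The sketch below is only an illustration of that idea: the field names, ranges and rows are invented, and `validate_row` is a hypothetical helper, not part of any existing tool.

```python
# Hypothetical example: a small data dictionary published alongside an open
# data file, plus a check that the delivered rows match what it documents.
# All field names, limits and values below are invented for illustration.

DATA_DICTIONARY = {
    "hospital_id": {"type": str, "meaning": "Identifier assigned by the publisher"},
    "year": {"type": int, "meaning": "Calendar year of the care episodes", "min": 2000, "max": 2100},
    "episodes": {"type": int, "meaning": "Number of completed care episodes", "min": 0},
}


def validate_row(row, dictionary):
    """Return a list of human-readable problems found in one row."""
    problems = []
    # Every documented field must be present, of the right type and in range.
    for field, spec in dictionary.items():
        if field not in row:
            problems.append(f"missing field: {field}")
            continue
        value = row[field]
        if not isinstance(value, spec["type"]):
            problems.append(
                f"{field}: expected {spec['type'].__name__}, got {type(value).__name__}"
            )
            continue
        if "min" in spec and value < spec["min"]:
            problems.append(f"{field}: {value} is below minimum {spec['min']}")
        if "max" in spec and value > spec["max"]:
            problems.append(f"{field}: {value} is above maximum {spec['max']}")
    # Fields that are delivered but not documented are also a quality problem.
    for field in row:
        if field not in dictionary:
            problems.append(f"undocumented field: {field}")
    return problems


rows = [
    {"hospital_id": "H1", "year": 2012, "episodes": 1500},
    {"hospital_id": "H2", "year": 2012, "episodes": -3},
    {"hospital_id": "H3", "year": 2012, "beds": 40},
]

report = {row["hospital_id"]: validate_row(row, DATA_DICTIONARY) for row in rows}
for hospital, problems in report.items():
    print(hospital, problems or "ok")
```

The dictionary carries the meaning of each field, and the check covers only the technical side of quality control. The human side, such as agreeing what counts as a "completed care episode", still has to be documented in words.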

pe3 commented 10 years ago

@lainesam Can you come up with an exercise (possibly with data) to internalize the core learnings of what you are proposing? Would comparing hospital performances or different academic programmes bring up the core learnings you are thinking about?

miskaknapek commented 10 years ago

Hey Sami, et al. Sami: Good goings - if I've understood some of your key points correctly, the question of how "data" is to be understood, and how it differs from how we otherwise get "information" or "knowledge", is a very important one. A good question is whether to make general exercises, so that people understand these things, or domain-specific exercises drawn from the students' own domain. A mixture of the two would probably make sense. Looking forward to hearing your response to @pe3's notes.

pe3 commented 10 years ago

@miskaknapek In slightly more concrete words I think @lainesam is pointing towards understanding different (methodological, political,...) biases in data production and use.

miskaknapek commented 10 years ago

thanks for the note - yes, I'd understood it. maybe my comments might not have reflected this accurately.

lainesam commented 10 years ago

@pe3 "Can you come up with an exercise (possibly with data) to internalize the core learnings of what you are proposing? Would comparing hospital performances or different academic programmes bring up the core learnings you are thinking about?"

Yes for both. National hospital benchmarking is currently open (aggregated) data.

About hospital benchmarking: http://www.thl.fi/fi_FI/web/fi/tilastot/aiheittain/erikoissairaanhoito/sairaaloiden_tuottavuus

Actual databases (i.e. data files, cubes): http://www.thl.fi/fi_FI/web/fi/tilastot/tiedonkeruut/benchmarking/raportointi/tietokannat

The whole process and its quality control could be used as an example of "open data". At the same time it can illustrate one common type of open data, an aggregated public sector registry, and all the issues that are linked to it.

An education comparison might also be a good example, since we are all directly familiar with health and education issues. More specialized topics such as football statistics or gardening data might not be so relevant for many people.

lainesam commented 10 years ago

@miskaknapek I think that examples and exercises should be made from different domains. However, if exercises or examples are done well, they are easily understood also by non-domain experts. People can easily recognize similar issues in their "own domain".

The issue is how to make people aware of the whole information production process and its quality control, so that in the future they understand how important good documentation and transparent data management are. It's important that others do it, but also that everyone does their own share.

pe3 commented 10 years ago

@lainesam It might be a good exercise to combine several data sets which are only partly comparable. A harmonized data set is not ideal for that kind of exercise. I ran into those kinds of problems when comparing Finnish higher education programmes for a data journalism project. The names, the contents and the groupings of study programmes change every few years.
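A minimal sketch of what such an exercise could look like, assuming two yearly tables whose programme names and groupings differ. All programme names, counts and the crosswalk mappings are invented for illustration, and `harmonize` is a hypothetical helper.

```python
# Hypothetical data: programme names and intake counts from two years in which
# the naming and grouping of programmes changed. Nothing here is real data.
intake_2010 = {"Computer Science": 120, "Information Systems": 80, "Statistics": 40}
intake_2014 = {"Computer and Information Sciences": 210, "Statistics": 35, "Data Science": 25}

# Crosswalks maintained by hand: programme name -> harmonized group.
# Building and documenting these mappings is the actual work of the exercise.
crosswalk_2010 = {
    "Computer Science": "Computing",
    "Information Systems": "Computing",
    "Statistics": "Statistics",
}
crosswalk_2014 = {
    "Computer and Information Sciences": "Computing",
    "Statistics": "Statistics",
    # "Data Science" is deliberately left unmapped: it has no 2010 counterpart.
}


def harmonize(intake, crosswalk):
    """Sum intake per harmonized group and list programmes with no mapping."""
    totals, unmapped = {}, []
    for programme, count in intake.items():
        group = crosswalk.get(programme)
        if group is None:
            unmapped.append(programme)
        else:
            totals[group] = totals.get(group, 0) + count
    return totals, unmapped


totals_2010, unmapped_2010 = harmonize(intake_2010, crosswalk_2010)
totals_2014, unmapped_2014 = harmonize(intake_2014, crosswalk_2014)

# Only the harmonized groups can be compared across years.
for group in sorted(set(totals_2010) | set(totals_2014)):
    print(group, totals_2010.get(group), totals_2014.get(group))
print("Not comparable:", unmapped_2010 + unmapped_2014)
```

The point of listing the unmapped programmes separately is that students see what is lost in harmonization instead of having it silently dropped or forced into a group.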

A classic Finnish example of comparability problems: the two official numbers for unemployment.