dined-io / dyned

Duplicated #30

Closed jurra closed 2 years ago

jurra commented 2 years ago

When I extract metadata from 'studies.json', I get two studies with the same name ('Dutch elderly') but different descriptions. These studies should have different names and ids...

[
    {'id': 1, 'name': 'Dutch elderly', 'description': 'This data was measured in 1982 in cooperation with Gemeentelijke Dienst Verpleging en Verzorging, the association for elderly care and housing in The Hague.', 'created_at': '2020-04-15T12:19:12+00:00', 'updated_at': '2020-04-15T12:25:12+00:00', 'sort_position': 7, 'code': 'gdvv1984', 'publication_date': '1984', 'measurement_date': '', 'sources': '', 'people': '- J.F.M. Molenbroek\n- J.J. Houtkamp\n- A.K.C. Burger', 'publications': '- Anthropometry of elderly people in the Netherlands; research and applications by JFM Molenbroek. Applied Ergonomics, volume 18 (1987), issue 3, page 187-199\n- IO bijzondere onderwerpen 16, bejaardenanthropometrie by JFM Molenbroek, JJ Houtkamp and AKC Burger. Technische Hogeschool Delft, 1983', 'url': '', 'has_shape_data': False, 'isLocked': 1, 'min_age': 53, 'max_age': 106, 'individuals_count': 822, 'age_groups': [
            {'id': 48, 'sort_position': None, 'min_age': 53, 'max_age': None, 'study_id': 1, 'name': '53+'
            }
        ], 'measures': [
            1,
            2,
            3,
            4,
            6,
            9,
            13,
            14,
            16,
            17,
            18,
            21,
            22,
            24,
            29,
            30,
            32,
            43,
            47,
            56,
            75,
            76,
            77,
            78,
            79,
            80
        ]
    },
    {'id': 3, 'name': 'Dutch elderly', 'description': 'The TU Delft Geron project was a national study in which 750 subjects, who lived independently, were assessed. In total about 80 variables, all more or less important for product use, were measured. The sample consisted of four age groups ranging from 50 to over 80 years of age; a group of young people (20 - 30 years) was also studied for the purpose of comparison (see DINED 2004 table). Women and men participated in about equal numbers.', 'created_at': '2020-04-15T12:20:06+00:00', 'updated_at': '2020-04-15T12:25:10+00:00', 'sort_position': 5, 'code': 'geron1998', 'publication_date': '1998', 'measurement_date': '', 'sources': '', 'people': '- C.E.M. van Beijsterveldt\n- J.M.Dirken (Project leader)\n- J.J.Houtkamp\n- J.F.M.Molenbroek\n- L.P.A. Steenbekkers\n- A.I.M.Voorbij', 'publications': '- L.P.A. Steenbekkers, C.E.M. van Beijsterveldt (eds), J.M.Dirken, J.F.M.Molenbroek, A.I.M. Voorbij and J.J. Houtkamp Design-relevant characteristics of ageing users, Delft University Press 1998 (per 1-1-2006 IO-Press Amsterdam)', 'url': '', 'has_shape_data': False, 'isLocked': 1, 'min_age': 50, 'max_age': 94, 'individuals_count': 627, 'age_groups': [
            {'id': 9, 'sort_position': None, 'min_age': 50, 'max_age': 54, 'study_id': 3, 'name': '50–54'
            },
            {'id': 10, 'sort_position': None, 'min_age': 55, 'max_age': 59, 'study_id': 3, 'name': '55–59'
            },
            {'id': 11, 'sort_position': None, 'min_age': 60, 'max_age': 64, 'study_id': 3, 'name': '60–64'
            },
            {'id': 12, 'sort_position': None, 'min_age': 65, 'max_age': 69, 'study_id': 3, 'name': '65–69'
            },
            {'id': 13, 'sort_position': None, 'min_age': 70, 'max_age': 74, 'study_id': 3, 'name': '70–74'
            },
            {'id': 14, 'sort_position': None, 'min_age': 75, 'max_age': 79, 'study_id': 3, 'name': '75–79'
            },
            {'id': 15, 'sort_position': None, 'min_age': 80, 'max_age': None, 'study_id': 3, 'name': '80+'
            }
        ], 'measures': [
            2,
            3,
            4,
            5,
            6,
            9,
            12,
            13,
            14,
            15,
            16,
            17,
            19,
            20,
            21,
            22,
            25,
            30,
            31,
            32,
            33,
            41,
            42,
            43,
            44,
            45,
            46,
            54,
            55,
            56,
            57,
            58,
            59,
            60,
            61,
            62,
            63,
            64,
            65,
            66,
            67,
            68,
            69,
            70,
            71,
            72,
            73,
            74,
            80
        ]
    }
]
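
A minimal sketch of how the duplicated names could be detected, assuming the metadata has been loaded as a list of dicts with 'id', 'name' and 'code' keys as in the output above. The function name `find_duplicate_names` is hypothetical and not part of dyned.

```python
from collections import defaultdict


def find_duplicate_names(studies):
    """Group studies by name and return only names used by more than one study.

    `studies` is assumed to be a list of dicts with at least
    'id', 'name' and 'code' keys, as in the studies.json output above.
    """
    by_name = defaultdict(list)
    for study in studies:
        by_name[study["name"]].append((study["id"], study["code"]))
    # Keep only the names that are shared by two or more studies.
    return {name: entries for name, entries in by_name.items() if len(entries) > 1}


# Trimmed-down version of the two records shown above.
studies = [
    {"id": 1, "name": "Dutch elderly", "code": "gdvv1984"},
    {"id": 3, "name": "Dutch elderly", "code": "geron1998"},
]
duplicates = find_duplicate_names(studies)
print(duplicates)
```

In the two records above the 'code' values (gdvv1984 and geron1998) do differ, so 'code' may be a safer key than 'name' for telling the studies apart.
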
jurra commented 2 years ago

The problem is not that the studies are duplicated; it is the names of the studies that are duplicated. There is another related bug: when I query studies 6 through 11, I get no data from the MySQL database. Study 12 also returns no data.

As a result, we have fewer studies in the files than the number of studies listed in the studies.json metadata.
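
One way to pin down which studies are missing is to compare the ids in the metadata with the ids for which a query actually returned rows. A rough sketch, assuming the metadata is a list of dicts with an 'id' key and that the ids with data have been collected separately. The name `studies_without_data` and the example ids are hypothetical and only for illustration.

```python
def studies_without_data(metadata, ids_with_data):
    """Return the sorted ids that appear in the metadata but yielded no rows."""
    metadata_ids = {study["id"] for study in metadata}
    return sorted(metadata_ids - set(ids_with_data))


# Illustrative ids only, not the real contents of studies.json.
metadata = [{"id": i} for i in range(1, 13)]
ids_with_data = [1, 2, 3, 4, 5]
missing = studies_without_data(metadata, ids_with_data)
print(missing)
```
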

toonhuysmans commented 2 years ago

This might be related to the studies that have no individual measurements attached and only have summary statistics stored in the database. The 1D database tool shows all studies. The Ellipse tool shows only the studies for which we have measurements of individuals, not those that only have summary statistics.

I'm not sure how this difference is exposed in the API, if it is at all.

jurra commented 2 years ago

Thanks for the clarification, @toonhuysmans. Maybe we can address that in the next catch-up.