EHDEN / ETL-UK-Biobank

ETL UK-Biobank
https://ehden.github.io/ETL-UK-Biobank/

Zeroes dropped from read code #295

Closed MaximMoinat closed 3 years ago

MaximMoinat commented 3 years ago

Among the most frequently occurring measurement source codes are many Read codes where the trailing '00' has been dropped; see table 9 in CDMInspection.


| ROW_NUM | SOURCE VALUE | #RECORDS | #SUBJECTS |
| -- | -- | -- | -- |
| 1 | 2469. | 94,500 | 4,300 |
| 2 | 246A. | 94,300 | 4,300 |
| 3 | 426.. | 46,200 | 15,200 |
| 4 | 22A.. | 40,300 | 4,200 |
| 5 | 22K.. | 36,800 | 4,200 |
| 6 | 42J.. | 35,800 | 6,500 |

MaximMoinat commented 3 years ago

Adding the trailing zeroes is already implemented here: https://github.com/EHDEN/ETL-UK-Biobank/blob/b3bda934641124fc788072d52c0d13298503b83d/src/main/python/util/code_cleanup.py#L50-L67

We have to find out why it does not work in these cases.
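
For reference while debugging, here is a minimal sketch of the kind of padding being discussed. It is not the code at the link above: the function name `pad_read_code` and the regular expression are hypothetical, and the sketch assumes that a bare Read v2 code is exactly 5 characters (alphanumerics right-padded with dots) and that the default 2-character term code is `00`.

```python
import re

# Assumption: a bare Read v2 code is 1-5 alphanumerics, right-padded with dots
# to exactly 5 characters (for example "2469." or "426..").
_BARE_READ_V2 = re.compile(r"[A-Za-z0-9]{1,5}\.{0,4}")


def pad_read_code(code: str) -> str:
    """Hypothetical helper: append the term code '00' to a bare 5-character code.

    Codes that already carry a term code, or that do not look like a bare
    Read v2 code, are returned unchanged (apart from whitespace stripping).
    The case of the input is preserved.
    """
    code = code.strip()
    if len(code) == 5 and _BARE_READ_V2.fullmatch(code):
        return code + "00"
    return code


if __name__ == "__main__":
    # Source values taken from the table in this issue.
    for value in ["2469.", "246A.", "426..", "22A..", "22K..", "42J.."]:
        print(value, "->", pad_read_code(value))
```

Running the source values from the table through a helper like this and comparing the result with what ends up in the CDM could show whether the padding step is being skipped for these records or whether the padded code simply fails to map afterwards.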

spiros commented 3 years ago

Hm, it could be because Read codes are case-sensitive, so maybe you are adding the suffix .00 to some codes that don't actually exist?
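
One way to test this hypothesis is to split the padded source codes into those that match the vocabulary exactly, those that match only when case is ignored, and those that do not match at all. The sketch below is an assumption-laden illustration: `classify_codes` is a hypothetical helper, and the vocabulary set in the usage example is a placeholder, not the actual contents of any vocabulary table.

```python
from typing import Dict, Iterable, List


def classify_codes(source_codes: Iterable[str],
                   vocabulary_codes: Iterable[str]) -> Dict[str, List[str]]:
    """Hypothetical helper: bucket source codes by how they match the vocabulary.

    'exact'     : code is present in the vocabulary with identical case
    'case_only' : code is present only if case is ignored
    'missing'   : code is not present even when case is ignored
    """
    vocab = set(vocabulary_codes)
    folded_vocab = {code.casefold() for code in vocab}

    result: Dict[str, List[str]] = {"exact": [], "case_only": [], "missing": []}
    for code in source_codes:
        if code in vocab:
            result["exact"].append(code)
        elif code.casefold() in folded_vocab:
            result["case_only"].append(code)
        else:
            result["missing"].append(code)
    return result


if __name__ == "__main__":
    # Placeholder vocabulary for illustration only.
    vocabulary = {"2469.00", "246A.00"}
    padded_sources = ["2469.00", "246a.00", "ZZZZZ00"]
    print(classify_codes(padded_sources, vocabulary))
```

If many of the unmapped codes land in `case_only`, the case of the source value is probably being changed somewhere before the lookup; if they land in `missing`, the padded code likely does not exist in the vocabulary at all.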

