mcveanlab / TreeWAS

MIT License

A feature request: phenotype transformation and --include option #9

Open jielab opened 3 years ago

jielab commented 3 years ago

Hi there,

This tool is really nice. However, I am having difficulty following the scripts needed to run it.

First, my phenotype file, produced by running ukbconv, looks like the example below. ICD

Could you please provide a script that converts such data into the format below?

ID R198 R104 M512 L720 M8414 L031 K802 S662 K267
1 1 1 1 0 0 0 0 0 0
2 0 0 0 1 0 0 0 0 0
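
To make the requested conversion concrete, here is a minimal Python sketch. It is not part of TreeWAS, and the function name `to_indicator_matrix` is hypothetical. It assumes the ukbconv output is a tab-separated file with one row per individual, an ID column (called `eid` here, which is an assumption), and further columns that each hold a single ICD code or are empty. The real input layout may differ, since the original example is not shown.

```python
import csv
import io


def to_indicator_matrix(rows, id_column="eid"):
    """Turn one-row-per-individual ICD code lists into a 0/1 table.

    `rows` is an iterable of dicts, e.g. from csv.DictReader. Every column
    other than `id_column` is ASSUMED to hold one ICD code or be empty.
    Returns (header, table), where header is ["ID", code1, code2, ...].
    """
    codes_by_id = {}
    for row in rows:
        codes = {
            value.strip()
            for key, value in row.items()
            if key is not None and key != id_column and value and value.strip()
        }
        # Merge codes if the same individual appears on several rows.
        codes_by_id.setdefault(row[id_column], set()).update(codes)

    all_codes = sorted(set().union(*codes_by_id.values())) if codes_by_id else []
    header = ["ID"] + all_codes
    table = [
        [individual] + [1 if code in codes else 0 for code in all_codes]
        for individual, codes in codes_by_id.items()
    ]
    return header, table


# Tiny made-up input standing in for the ukbconv output (hypothetical columns).
example = "eid\tcode_0\tcode_1\tcode_2\n1\tR198\tR104\tM512\n2\tL720\t\t\n"
rows = csv.DictReader(io.StringIO(example), delimiter="\t")
header, table = to_indicator_matrix(rows)
print(" ".join(header))
for line in table:
    print(" ".join(str(x) for x in line))
```

For a real file, the `io.StringIO(example)` object would be replaced by an open file handle. The sketch holds one set of codes per individual in memory, which may need rethinking at the scale of 500,000 individuals.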

Since there are ~20,000 ICD codes and 500,000 individuals, a single file containing all the codes might be too big. An option to specify a "phenotype inclusion file" would therefore be very useful. For example, I might want to analyze only the sub-tree for circulatory diseases, i.e., all ICD codes starting with the letter "I".
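
The inclusion idea can be sketched in Python as follows. This is only an illustration of the requested behaviour, not an existing TreeWAS option. The helper names `load_inclusion_list` and `filter_columns` and the inclusion-file layout (one code or prefix per line, `#` for comments) are assumptions, and the example data are made up.

```python
def load_inclusion_list(lines):
    """Read one ICD code or prefix per line, skipping blanks and '#' comments."""
    return [s.strip() for s in lines if s.strip() and not s.lstrip().startswith("#")]


def filter_columns(header, table, prefixes, id_column="ID"):
    """Keep the ID column plus every code column that starts with any prefix."""
    prefixes = tuple(prefixes)
    keep = [
        i for i, name in enumerate(header)
        if name == id_column or name.startswith(prefixes)
    ]
    new_header = [header[i] for i in keep]
    new_table = [[row[i] for i in keep] for row in table]
    return new_header, new_table


# Made-up phenotype table in the wide 0/1 format described above.
header = ["ID", "I10", "I251", "K802", "R104"]
table = [[1, 1, 0, 0, 1], [2, 0, 1, 1, 0]]

# Hypothetical inclusion file content: keep the circulatory sub-tree only.
prefixes = load_inclusion_list(["# circulatory diseases", "I", ""])
sub_header, sub_table = filter_columns(header, table, prefixes)
print(sub_header)
print(sub_table)
```

Matching on prefixes rather than exact codes means a single line such as `I` selects a whole chapter, while a full code such as `I251` still selects just that column.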

Please kindly let me know if that is doable.

Best regards, Jie