Closed lgragert closed 6 months ago
Our workaround is to first do a two-field redux, then do the ARD redux.
I see that it's correctly finding the P group.
❯ pyard -g 'DRB4*01:03:19' -r "P"
DRB4*01:01P
Or do P
first and then lgx
>>> ard.redux(ard.redux('DRB4*01:03:19', 'P'), 'lgx')
'DRB4*01:01'
I no longer agree with the suggested workaround P
first and then lgx
.
For the allele C*02:10:02
, lgx
gives the correct behavior but the workaround is incorrect.
>>> ard.redux("C*02:10:02", 'lgx')
'C*02:10'
>>> ard.redux(ard.redux('C*02:10:02', 'P'), 'lgx')
'C*02:02'
We do have a ping
mode. i.e. P in G
I think this maybe what you want.
>>> my_config = {"ping": True}
>>> ard = pyard.init(config=my_config)
>>> ard.redux("DRB4*01:03:01", 'lgx')
'DRB4*01:01'
Bummer, I tried ping mode and can't get it to work as expected either.
Any tips?
>>> my_config = {"ping": True}
>>> ard = pyard.init(config=my_config)
Creating /var/folders/_g/yg03ywj55jz58thbzgcwn21h0000gp/T/pyard-lgragert/pyard-Latest.sqlite3 as cache.
Version: 3560
>>> ard.redux("DRB4*01:03:19", 'lgx')
'DRB4*01:03'
>>> ard.redux("DRB4*01:03:19", 'P')
'DRB4*01:01P'
Interesting. It works for DRB4*01:03:01
but not DRB4*01:03:19
>>> import pyard
>>> my_config = {"ping": True}
>>> ard = pyard.init(config=my_config)
>>> ard.redux("DRB4*01:03:19", 'lgx')
'DRB4*01:03'
>>> ard.redux("DRB4*01:03:01", 'lgx')
'DRB4*01:01'
Guess today is DRB4
breaking day :) I'll take a look.
See above, I revised a response and now disagree with your suggested workaround.
The solution is to use the ping
mode, but I'll have to investigate why it's not working for others besides DRB4*01:03:01
No no no... ping mode doesn't work either, as shown above. The only valid workaround we found is to first do a two-field redux, then do the ARD redux.
Reducing twice with ping
mode works for the above DRB4 cases.
i.e.
ard.redux(ard.redux(allele)
Here are the DRB4s reduced twice and compared with the P
version.
DRB4*01:03 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:02 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:03 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:04 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:05 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:06 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:07 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:08 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:09 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:10 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:11 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:12 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:13 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:14 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:15 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:16 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:17 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:18 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:19 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:20 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:21 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:22 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:23 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:24 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:25 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:26 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:27 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:28 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:29 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:30 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:31 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:32 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:33 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:34 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:35 lgx='DRB4*01:03', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:01 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:03 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:04 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:05 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:06 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:07 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:08 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:09 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:10 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:11 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:12 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:14 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:15 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:16 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:17 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:01:18 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:02:01 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
DRB4*01:03:02:02 lgx='DRB4*01:01', lgx^2=DRB4*01:01 P: DRB4*01:01P Same: True
C*02:10:02 lgx='C*02:10', lgx^2=C*02:02 P: C*02:02P Same: True
The C*02:10:02
appears in P group but not in G group.
From hla_nom_p.txt
:
9119,C*,C*02:10:02,C*02:02P,C*02:10,C*02:10:02,C*02:02
So shouldn't C*02:10:02
reduce to C*02:02
in ping
mode ?
I know there were issues we had with C*02:10
s previously. I don't recall the details right now.
I suppose you could fix it by automatically applying the redux twice, as the user would not know for what alleles that would need to be done.
Yes, C*02:10:02
should reduce to C*02:02
under the lgx
redux.
I'll organize a meeting on this. Its a bit messy when you consider that no combination of lgx^N or ping:true/false fixes it without introducing another issue
I've confirmed that DRB4*01:03
indeed occurs in the most recent GLID files I've analyzed and propagates into the haplotype frequency estimates, which has the consequence of inappropriately decreasing the frequency of DRB4*01:01g
.
New NMDP GLID/pull files should be generated and new haplotype frequency estimates obtained after the fix is applied to pyARD.
grep "DRB4\*01:03" glid.txt
7013,DRB4*01:03+DRB4*01:03
17360,DRB4*01:01+DRB4*01:03
345530,DRB4*01:01/DRB4*01:07/DRB4*01:10/DRB4*01:11/DRB4*01:12/DRB4*01:13/DRB4*01:14N/DRB4*01:17/DRB4*01:18/DRB4*01:20/DRB4*01:21/DRB4*01:22/DRB4*01:23/DRB4*01:24/DRB4*01:25/DRB4*01:26/DRB4*01:27/DRB4*01:28/DRB4*01:29/DRB4*01:30/DRB4*01:31/DRB4*01:32/DRB4*01:33/DRB4*01:34/DRB4*01:36/DRB4*01:37/DRB4*01:38N/DRB4*01:39/DRB4*01:40/DRB4*01:41/DRB4*01:42/DRB4*01:44/DRB4*01:45/DRB4*01:47/DRB4*01:48/DRB4*01:49/DRB4*01:50/DRB4*01:51/DRB4*01:52/DRB4*01:53/DRB4*01:54/DRB4*01:55/DRB4*01:56N/DRB4*01:57N/DRB4*01:58/DRB4*01:59/DRB4*01:61N/DRB4*01:63/DRB4*01:64/DRB4*01:65N/DRB4*01:66/DRB4*01:67/DRB4*01:68/DRB4*01:69/DRB4*01:70/DRB4*01:71N/DRB4*01:73/DRB4*01:74/DRB4*01:75/DRB4*01:77/DRB4*01:78/DRB4*01:80N/DRB4*01:81/DRB4*01:82/DRB4*01:84N/DRB4*01:86/DRB4*01:87/DRB4*01:88/DRB4*01:89/DRB4*01:91/DRB4*01:92/DRB4*01:93/DRB4*01:94/DRB4*01:95/DRB4*01:96/DRB4*01:97/DRB4*01:98/DRB4*01:99/DRB4*01:100/DRB4*01:101/DRB4*01:102/DRB4*01:103/DRB4*01:104/DRB4*01:105/DRB4*01:106/DRB4*01:108N/DRB4*01:109/DRB4*01:110/DRB4*01:111/DRB4*01:112/DRB4*01:113N/DRB4*01:114/DRB4*01:115N/DRB4*01:116+DRB4*01:03
388288,DRB4*01:02+DRB4*01:03
433241,DRB4*01:01/DRB4*01:02/DRB4*01:04/DRB4*01:05/DRB4*01:07/DRB4*01:08/DRB4*01:10/DRB4*01:11/DRB4*01:12/DRB4*01:13/DRB4*01:14N/DRB4*01:15/DRB4*01:16N/DRB4*01:17/DRB4*01:18/DRB4*01:19/DRB4*01:20/DRB4*01:21/DRB4*01:22/DRB4*01:23/DRB4*01:24/DRB4*01:25/DRB4*01:26/DRB4*01:27/DRB4*01:28/DRB4*01:29/DRB4*01:30/DRB4*01:31/DRB4*01:32/DRB4*01:33/DRB4*01:34/DRB4*01:35/DRB4*01:36/DRB4*01:37/DRB4*01:38N/DRB4*01:39/DRB4*01:40/DRB4*01:41/DRB4*01:42/DRB4*01:43/DRB4*01:44/DRB4*01:45/DRB4*01:46/DRB4*01:47/DRB4*01:48/DRB4*01:49/DRB4*01:50/DRB4*01:51/DRB4*01:52/DRB4*01:53/DRB4*01:54/DRB4*01:55/DRB4*01:56N/DRB4*01:57N/DRB4*01:58/DRB4*01:59/DRB4*01:60/DRB4*01:61N/DRB4*01:62/DRB4*01:63/DRB4*01:64/DRB4*01:65N/DRB4*01:66/DRB4*01:67/DRB4*01:68/DRB4*01:69/DRB4*01:70/DRB4*01:71N/DRB4*01:72/DRB4*01:73/DRB4*01:74/DRB4*01:75/DRB4*01:77/DRB4*01:78/DRB4*01:80N/DRB4*01:81/DRB4*01:82/DRB4*01:83/DRB4*01:84N/DRB4*01:86/DRB4*01:87/DRB4*01:88/DRB4*01:89/DRB4*01:91/DRB4*01:92/DRB4*01:93/DRB4*01:94/DRB4*01:95/DRB4*01:96/DRB4*01:97/DRB4*01:98/DRB4*01:99/DRB4*01:100/DRB4*01:101/DRB4*01:102/DRB4*01:103/DRB4*01:104/DRB4*01:105/DRB4*01:106/DRB4*01:108N/DRB4*01:109/DRB4*01:110/DRB4*01:111/DRB4*01:112/DRB4*01:113N/DRB4*01:114/DRB4*01:115N/DRB4*01:116/DRB4*01:117/DRB4*01:120/DRB4*01:121N/DRB4*01:122/DRB4*01:125/DRB4*01:131/DRB4*01:133/DRB4*01:135/DRB4*01:137/DRB4*01:139/DRB4*01:140/DRB4*01:149N/DRB4*01:150/DRB4*01:154/DRB4*01:157/DRB4*01:159N/DRB4*01:160/DRB4*01:163/DRB4*01:170/DRB4*01:172+DRB4*01:03
601755,DRB4*01:03+DRB4*01:44
862578,DRB4*01:03+DRB4*01:31
1131079,DRB4*01:03+DRB4*01:05
I suppose you could fix it by automatically applying the redux twice, as the user would not know for what alleles that would need to be done.
Yes,
C*02:10:02
should reduce toC*02:02
under thelgx
redux.
See PR #323 that does reduction twice if necessary in ping
mode.
import pyard
ard = pyard.init(config={'ping': True})
ard.redux('C*02:10:02', 'lgx')
# 'C*02:02'
We can discuss the PR during the meeting.
This allele is included in DRB4*01:01P but is not currently in any G-group.
https://www.ebi.ac.uk/ipd/imgt/hla/alleles/allele/?accession=HLA28544
The correct result of this redux should be DRB4*01:01.