DKMS / Hapl-o-Mat

A software for haplotype inference

running just one allele #13

Closed mbraakman closed 2 years ago

mbraakman commented 2 years ago

Hi, I've been using Hapl-o-Mat for a while, but I now get an error I haven't seen before.

I tried to run Hapl-o-Mat on DQB1 data only. With g or 4d resolution it works, but when I select P or G I get a 'Segmentation fault (core dumped)'.

Any suggestions?

Hapl-o-Mat
Copyright (C) 2016 DKMS gGmbH

#########Initialization
MA format
#########Parameters I/O
Input: Matchispopulation/DQA-DQB114042022.dat
Output haplotypes: run/haplotypes.dat
Output genotypes: run/genotypes.dat
Output estimated haplotype frequencies: run/hfs.dat
Output epsilon and log(L): run/epsilon.dat
#########Parameters resolving genotypes
Minimal frequency of genotypes: 1e-05
Loci with target allele resolutions: DQB1 : G
Apply ambiguity filter: no
#########Parameters EM algorithm
Haplotype frequency initialization: perturbation
Epsilon: 1e-06
Cut haplotype frequencies: 1e-06
Renormalize haplotype frequencies
Zero: 1e-14
Seed: 1000

#########Data preprocessing
Segmentation fault (core dumped)

mbraakman commented 2 years ago

I now get it with multiple alleles as well.

usolloch commented 2 years ago

Hi, we have also seen this error message sporadically. Usually it indicates a problem in the input file, for example missing data for one of the input loci in one of the input lines. A blank instead of a tab separator in the input file may also be the cause. If you wish, you can send us a short example input file that triggers the "segmentation fault" error message and we will try to locate the problem.
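The two causes named above (a missing locus entry, or a blank where a tab belongs) can be checked mechanically before running Hapl-o-Mat. The following is only a minimal sketch: the function name `check_tab_separated` is hypothetical, and it assumes nothing about the input format beyond it being tab-separated with the same number of fields on every line.

```python
from typing import List


def check_tab_separated(lines: List[str]) -> List[str]:
    """Return human-readable problems found in tab-separated input lines.

    Assumption: every line should have as many tab-separated fields as the
    first non-empty line, and no field should be empty or contain a blank.
    """
    problems = []
    expected = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            problems.append(f"line {number}: empty line")
            continue
        fields = line.split("\t")
        if expected is None:
            # The first non-empty line defines the expected column count.
            expected = len(fields)
        elif len(fields) != expected:
            problems.append(
                f"line {number}: {len(fields)} fields, expected {expected}"
            )
        for position, field in enumerate(fields, start=1):
            if not field.strip():
                problems.append(f"line {number}: field {position} is empty")
            elif " " in field:
                problems.append(
                    f"line {number}: field {position} contains a blank: {field!r}"
                )
    return problems
```

Passing the lines of the input file (for example from `open(path).readlines()`) and printing the returned list shows which lines to inspect. An empty list means only that these two checks passed, not that the file is valid in every other respect.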

mbraakman commented 2 years ago

Sorry for not responding, the problem seems to have resolved itself...

I do have another question: how would I create an HF file for DRB1-DRB345?

My file contains either NNNN or an allele, but NNNN is not accepted. I then tried replacing NNNN with [Blank], but this doesn't work either.

Any suggestions?

usolloch commented 2 years ago

It is possible to create haplotype frequencies for data including DRB345, but you have to make a small change to the input files Hapl-o-Mat uses for its calculations:

The folder "data" contains a file "AllAllelesExpanded.txt". You have to extend this file by three lines that define the "NNNN" entries.

Try adding just these three lines:

DRB3NNNN	DRB3NNNN
DRB4NNNN	DRB4NNNN
DRB5NNNN	DRB5NNNN

(separated by tab!). These lines tell Hapl-o-Mat that NNNN data is valid for these three loci and may only be translated into itself.
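Adding the lines by script avoids accidentally typing a blank instead of a tab. This is only a sketch under assumptions: the helper name `append_self_mapping_entries` is hypothetical, the entry strings should be copied from the reply above and checked against the notation used by the existing lines of the file, and the file should be backed up first.

```python
from pathlib import Path


def append_self_mapping_entries(path, entries):
    """Append 'entry<TAB>entry' lines for entries not yet defined in the file.

    Returns the list of lines that were actually added.
    """
    path = Path(path)
    text = path.read_text()
    # First tab-separated field of every existing non-empty line.
    existing = {line.split("\t")[0] for line in text.splitlines() if line}
    new_lines = [f"{entry}\t{entry}" for entry in entries if entry not in existing]
    if new_lines:
        # Start on a fresh line if the file does not end with a newline.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with path.open("a") as handle:
            handle.write(prefix + "\n".join(new_lines) + "\n")
    return new_lines
```

A call such as `append_self_mapping_entries("data/AllAllelesExpanded.txt", ["DRB3NNNN", "DRB4NNNN", "DRB5NNNN"])` would add the three lines once. Running it a second time adds nothing, because the entries are then already present.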

Good luck!

mbraakman commented 2 years ago

Great thank you!!!

Regards, Martijn

mbraakman commented 2 years ago

The segmentation fault described above seems to have returned. I've checked: there are no missing alleles, and it only occurs when running DQB1/DPB1 at P or G group level.

usolloch commented 2 years ago

Hi, thank you very much for your message. We were able to reproduce the error on our end and trace it back to a change in the format of the IMGT/HLA data to be downloaded. We are working on the fix and hope to provide it to you tomorrow.

usolloch commented 2 years ago

Hi Martijn,

We merged a small patch into the Hapl-o-Mat files this morning. The changes affect only the files BuildLargeG.py and BuildP.py in the folder prepareData. This should resolve your problems with the DQB1 and DPB1 calculations.

If you still encounter any problems, please do not hesitate to contact us. Your feedback is much appreciated.

Regards, Ute