AlexanderDilthey / MHC-PRG

Population Reference Graphs for the HLA and MHC.
GNU General Public License v3.0

compiling error #17

Closed mengyao closed 8 years ago

mengyao commented 8 years ago

Dear Alexander,

Thank you for making this interesting program. I get the following error when I try to compile it:

    ...
    In file included from readFilter/readFilter.cpp:8:0:
    readFilter/readFilter.h:19:30: fatal error: api/BamAlignment.h: No such file or directory
     #include "api/BamAlignment.h"
                                  ^
    compilation terminated.
    make: *** [../obj/readFilter.o] Error 1

May I ask how I can fix this problem?

Yours,

Mengyao

mengyao commented 8 years ago

Dear Alexander,

The compilation issue was solved by revising the makefile.

I'm now trying to run your program. Your HLA-PRG.md file says to use this command line:

    ./HLAtypeinference.pl --actions pnai --sampleIDs SAMPLEID --BAMs /path/to/indexed/bam.bam --referenceGenome /path/to/referenceGenome/as/one/fasta/file

May I ask what SAMPLEID is?

I have a whole-genome sequenced NA12878 BAM. Can I try your program on this data? If so, what SAMPLEID should I use, or can I omit it from the command line?

The HLA*PRG data package mentioned in HLA-PRG.md is very large. May I ask how to use this data package?

Many thanks,

Mengyao

mengyao commented 8 years ago

Dear Alexander,

I've tried to type NA12878 using HLA*PRG but failed. I hope you can help me run your program correctly.

I constructed the file hierarchy and used the following command line to run your program:

    ./HLAtypeinference.pl --actions pnai --sampleIDs NA12878 --BAMs /home/mengyao/data/hla/test/NA12878-Garvan-Vial1.fda_challenge_workflow_firepony.realigned.base_recalibrated.merged.bam --referenceGenome /home/mengyao/data/hla/test/human_g1k_v37_decoy.fasta &

The program aborted after about 1.5 hours and generated only these two files: reads.p_1 and reads.p_2.

They seem to be FASTQ files.

May I ask how I can get the expected HLA typing results? Could you suggest how to fix the problem?

Many thanks,

Mengyao

AlexanderDilthey commented 8 years ago

Hi Mengyao,

Has the problem with the data package been sorted out? If not, could you describe the problem more specifically?

Regarding the problem: can you capture STDOUT and STDERR by appending '&> output.txt' to the command?

It seems that "positive extraction" (the 'p' bit of 'pnai') has produced some output. Perhaps "negative extraction" (the 'n' bit) failed. It requires a fair amount of memory, say at least 80G. Are you running on a high-memory machine?
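
A minimal Python sketch of the same diagnostic step, for cases where the run is launched from a script rather than typed into a shell. It sends STDOUT and STDERR to one file (the effect of '&> output.txt' in bash) and reports the machine's physical memory. The helper names `run_and_capture` and `total_memory_gib` are hypothetical, the 80 GiB threshold is only the rough figure mentioned above, and the memory check assumes a Unix-like system that exposes `SC_PHYS_PAGES`.

```python
import os
import subprocess
import sys


def run_and_capture(cmd, log_path="output.txt"):
    """Run cmd, writing STDOUT and STDERR to log_path; return the exit code."""
    with open(log_path, "w") as log:
        # stderr=subprocess.STDOUT merges both streams, like `&>` in bash.
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


def total_memory_gib():
    """Return physical memory in GiB, or None if the platform cannot report it."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (ValueError, OSError, AttributeError):
        return None
    if pages < 0 or page_size < 0:
        return None
    return pages * page_size / 1024 ** 3


if __name__ == "__main__":
    mem = total_memory_gib()
    if mem is not None and mem < 80:
        print(f"Only {mem:.0f} GiB of RAM; negative extraction may run out of memory.")
    # Stand-in command; replace with the real HLAtypeinference.pl invocation.
    code = run_and_capture([sys.executable, "-c", "print('hello')"])
    print("exit code:", code)
```

A non-zero exit code together with the tail of output.txt is usually the most useful thing to post back to the thread.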

mengyao commented 8 years ago

Dear Alexander,

Thank you so much for your suggestions. They are very helpful.

I downloaded the data package fully and expanded it into the folder that HLA-PRG.md suggests. The expanded data package is 63G in size. Does this size seem correct?

I’ll try your suggestions and let you know later.

Many thanks,

Mengyao


AlexanderDilthey commented 8 years ago

Yes, this seems right.

mengyao commented 8 years ago

Dear Alexander,

I hope you can help me understand the output of HLA*PRG.

I’ve run HLA*PRG on NA12878 and got the following results:

Locus Chromosome Allele Q1 Q2 AverageCoverage CoverageFirstDecile MinimumCoverage proportionkMersCovered LocusAvgColumnError LocusMinimumColumnErrorP
A 1 A*01:01:01:01;A*01:01:01:02N;A*01:01:38L;A*01:01:51;A*01:04N;A*01:103;A*01:107;A*01:109;A*01:132;A*01:141;A*01:142;A*01:22N;A*01:32;A*01:37;A*01:45;A*01:56N;A*01:81;A*01:87N 1 -135 48.7887 35 31 1 0.00398631 0
A 2 A*11:01:01;A*11:01:46;A*11:01:47;A*11:01:49;A*11:01:52;A*11:01:53;A*11:100;A*11:102;A*11:108;A*11:120;A*11:124;A*11:126;A*11:129;A*11:142;A*11:154;A*11:21N;A*11:69N;A*11:86 1 -135 48.7887 35 31 1 0.00398631 0
B 1 B*08:01:01;B*08:01:14;B*08:01:20;B*08:109;B*08:19N 1 -81 45.8523 38 34 1 0.00310451 0
B 2 B*56:01:01;B*56:24;B*56:40 1 -81 45.8523 38 34 1 0.00310451 0
…

For example, for the 2nd allele of gene B, it gives the typed alleles B*56:01:01;B*56:24;B*56:40.

Do the 3 alleles B*56:01:01, B*56:24 and B*56:40 have the same highest probability, or are they ordered by probability (with B*56:01:01 the most probable)?

Many thanks,

Mengyao

On May 31, 2016, at 11:04 PM, Alexander Dilthey notifications@github.com wrote:

Closed #17 https://github.com/AlexanderDilthey/MHC-PRG/issues/17.


AlexanderDilthey commented 8 years ago

Hi Mengyao,

For the purpose of HLA typing at G group resolution (also sometimes called "6-digit G", http://hla.alleles.org/alleles/g_groups.html), alleles with identical sequence over the peptide binding site (exons 2 and 3 for HLA class I, exon 2 for HLA class II) are considered identical.

As HLA*PRG carries out typing at G group resolution, we also group alleles together by PBS sequence. In the context of your example, these 3 alleles would have the same PBS sequence and would therefore be considered to be one allele.
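
To illustrate the idea of grouping by PBS sequence, here is a small Python sketch. It is not how HLA*PRG does it internally (the program groups in graph space, not by plain string comparison). The function name `group_by_pbs`, the allele names and the sequences are all made up for illustration and are not real HLA data.

```python
from collections import defaultdict


def group_by_pbs(exon_seqs, pbs_exons):
    """Group allele names whose sequences are identical over the given exons.

    exon_seqs: {allele_name: {exon_number: sequence}}
    pbs_exons: exon numbers that make up the peptide binding site
    """
    groups = defaultdict(list)
    for allele, exons in exon_seqs.items():
        # Only the PBS exons take part in the comparison.
        key = tuple(exons[e] for e in pbs_exons)
        groups[key].append(allele)
    return sorted(sorted(members) for members in groups.values())


# Made-up toy sequences, NOT real HLA data.
toy = {
    "X*01:01": {1: "AAA", 2: "ACGT", 3: "GGCC"},
    "X*01:02": {1: "AAT", 2: "ACGT", 3: "GGCC"},  # differs only in exon 1
    "X*02:01": {1: "AAA", 2: "ACGA", 3: "GGCC"},  # differs in exon 2
}

# Class I convention from the text above: exons 2 and 3.
print(group_by_pbs(toy, pbs_exons=(2, 3)))
# → [['X*01:01', 'X*01:02'], ['X*02:01']]
```

The first two toy alleles differ only outside the PBS exons, so they end up in one group, which is the situation of the three B*56 alleles in your example.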

There might sometimes be minor differences between G groups as defined by IMGT and the groups emitted by HLA*PRG; this is because our grouping happens in graph space. Conceptually, however, they are the same thing.

If you update to the most recent version, the program will automatically generate G group output; this might be easier to parse.
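
For output in the older format posted earlier in this thread, a short Python sketch like the following can split each semicolon-separated Allele field into the members of one group. It assumes whitespace-delimited rows with a header line; `parse_best_guess` is a hypothetical helper, and the sample rows are the two B rows from above, cut down to the first five columns for brevity. The layout of the newer G group output is not covered here.

```python
def parse_best_guess(lines):
    """Parse whitespace-delimited result rows into a list of dicts.

    The first line is taken as the header. The semicolon-separated Allele
    field is split into a list; all members belong to one group.
    """
    header = lines[0].split()
    calls = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank or truncated lines
        row = dict(zip(header, fields))
        row["Allele"] = row["Allele"].split(";")
        calls.append(row)
    return calls


# Rows taken from the thread, truncated to the first five columns.
sample = [
    "Locus Chromosome Allele Q1 Q2",
    "B 1 B*08:01:01;B*08:01:14;B*08:01:20;B*08:109;B*08:19N 1 -81",
    "B 2 B*56:01:01;B*56:24;B*56:40 1 -81",
]

for call in parse_best_guess(sample):
    print(call["Locus"], call["Chromosome"], len(call["Allele"]), "alleles in group")
```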