jason-weirather / hla-polysolver

Fork of the Polysolver project

HLA typing testing discordance #7

Closed wclee47 closed 6 years ago

wclee47 commented 6 years ago

Hi Jason,

First of all, thank you for creating this great conda environment.

I followed the instructions and ran the test using the test.bam included in the Polysolver package. However, I obtained a different result, as follows.

    winners1 hla_a_24_02_01_01 hla_b_39_01_01_02l hla_c_07_01_05
    winners2 hla_a_24_02_01_01 hla_b_39_01_01_03 hla_c_06_02_08

HLA-A alleles are okay, but one of the HLA-B and HLA-C alleles does not match the ones from "orig.winners.hla.txt".

Have you run this test yourself, and did your result match?

I appreciate your help!

Won-Chul

jason-weirather commented 6 years ago

Hi Won-Chul @wclee47, I think I did notice this discrepancy when I ran it. Sorry, I'm having trouble pulling up a site that explains all the numerical fields in the class I haplotypes, but I think I wasn't too alarmed because the first two digits were called correctly for all of them, and for my purposes I thought that was close enough. (I could be wrong about the importance of those end numbers, but I think those are further subtypes.)

I'd also point out that Sachet has further updated his Docker image, https://hub.docker.com/r/sachet/polysolver/, and that is what should be run to execute a "current" Polysolver. I recently gained access to a cluster that permits Docker, so I've been using Sachet's v4-tagged image.

wclee47 commented 6 years ago

Jason, thanks so much for the prompt answer. I am trying to run HLALOH, which requires HLA types inferred by Polysolver. I guess the impact is minimal if the alleles differ only in the last two digits. I will proceed with the current version, but it would be great to add a note about this, because the first thing users do is run this test. Anyway, I really appreciate your help!
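
For readers comparing their own output with orig.winners.hla.txt, here is a minimal sketch of a two-field comparison. It assumes that Polysolver's underscore-separated names follow the standard HLA nomenclature fields (so hla_b_39_01_01_02l would correspond to HLA-B*39:01:01:02L), where the first two fields identify the allele group and the specific protein, and later fields mark synonymous and non-coding differences. The function names `two_field` and `same_two_field` are made up for illustration and are not part of Polysolver.

```shell
#!/bin/sh
# Illustrative helpers (not part of Polysolver).
# Assumption: allele names look like hla_<gene>_<f1>_<f2>[_<f3>[_<f4>]].

# Print the gene plus the first two fields, e.g. hla_b_39_01_01_02l -> hla_b_39_01
two_field() {
    printf '%s\n' "$1" | cut -d_ -f1-4
}

# Succeed when two allele names agree at two-field resolution.
same_two_field() {
    [ "$(two_field "$1")" = "$(two_field "$2")" ]
}
```

With the results above, `same_two_field hla_b_39_01_01_02l hla_b_39_01_01_03` succeeds, while `same_two_field hla_c_07_01_05 hla_c_06_02_08` does not, since those two names differ already in the first field.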

jason-weirather commented 6 years ago

Thanks @wclee47, I'll edit the readme now to add a prominent caution about this and a link to that docker site. Good luck. I made this to run HLALOH, but I still haven't made it to that point yet. I hope it gives you some interpretable results.

wclee47 commented 6 years ago

Hi Jason,

I am trying to run Polysolver on Docker as you suggested, but I am having some trouble. If you can help me, it would be much appreciated.

I pulled sachet/polysolver:v4 and started the container with the following command:

docker run -i -t --name polysolver sachet/polysolver:v4

Then I tried to run the example included with the Polysolver package with the following command:

scripts/shell_call_hla_type test/test.bam Unknown 1 hg19 STDFQ 0 test

But it failed with the following messages:

    ...
    scripts/shell_call_hla_type: 62: [: 1: unexpected operator
    scripts/shell_call_hla_type: 73: [: 0: unexpected operator
    ...
    scripts/shell_call_hla_type: 109: [: hg19: unexpected operator
    scripts/shell_call_hla_type: 119: [: hg19: unexpected operator
    ...

I think this may be related to shell programming syntax.

Do you have any idea why it happened and any workaround for this?

Thanks,

Won-Chul

-- Won-Chul Lee, Ph.D. Postdoctoral Fellow The University of Texas MD Anderson Cancer Center 1881 East Road, 3SCR5.4101 Unit 1954 Houston, Texas 77054 WLee6@mdanderson.org

jason-weirather commented 6 years ago

Hi @wclee47, I'll do my best to give you some suggestions, but I'd recommend contacting Sachet directly if my experience doesn't lead you to a solution.

I've gotten his docker working in a WDL pipeline, but even there I had to make a few tweaks along the way to get it to run. I put some snippets from my WDLs below since they aren't public yet. If you aren't familiar with WDL, that's okay, but keep in mind that those commands are being executed inside the docker environment. The biggest hang-ups I hit getting his docker to run were A) it requires chromosome names in the Ensembl format (without the "chr" in them), and B) it requires a SAMTOOLS_DIR environment variable to be set inside the docker. Perhaps he's fixed this, but I'm using a version of his docker that I've frozen because I need it to stay unchanged for reproducibility purposes.

  1. I must preprocess the data to strip the "chr" strings from my chromosome names. These appear in all the BAM files as well. I have a docker image that contains a tool for this:

vacation/polysolverhelper:1.0.0

which I execute in the WDL as

    python /remove_chr_substring_bam.py ${input_bam} output.bam
    samtools index output.bam

  2. I run hla_type

I am using my own copy of Sachet's polysolver because I wanted to guarantee the version is frozen: vacation/polysolver:v4

Then I execute in the WDL

    echo $(pwd)
    export SAMTOOLS_DIR="/home/polysolver/binaries"
    bash /home/polysolver/scripts/shell_call_hla_type "${bam}" ${race} ${includeFreq} ${build} STDFQ ${insertCalc} $(pwd)

Notice that SAMTOOLS_DIR must be set. This environment variable is required by polysolver but it was not set in the docker, so I had to set it myself here.

  3. I run hla_type_mutations

    export SAMTOOLS_DIR="/home/polysolver/binaries"
    bash /home/polysolver/scripts/shell_call_hla_mutations_from_type "${normal_bam}" "${tumor_bam}"  ${winners} ${build} STDFQ $(pwd) ${prefix}

  4. I run annotate_hla

    export SAMTOOLS_DIR="/home/polysolver/binaries"
    bash /home/polysolver/scripts/shell_annotate_hla_mutations ${prefix} ${mutation_tar_gz} $(pwd)

Hopefully that may give you some clue to figure out your problem. I've only been running it on hg38.
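
The two requirements mentioned above (no "chr" prefix in the reference names, and SAMTOOLS_DIR being set) could be checked before launching the scripts with something like the following. This is only a sketch: the helper names `header_has_chr_prefix` and `check_samtools_dir` are hypothetical and are not part of Polysolver or of the helper image, and the first function expects SAM header text on stdin.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; names are illustrative, not from Polysolver.

# Succeed if any @SQ header line read from stdin names a "chr"-prefixed reference.
header_has_chr_prefix() {
    grep -q '^@SQ.*SN:chr'
}

# Fail with a message when SAMTOOLS_DIR is unset, empty, or not a directory.
check_samtools_dir() {
    if [ -z "${SAMTOOLS_DIR:-}" ] || [ ! -d "$SAMTOOLS_DIR" ]; then
        echo "SAMTOOLS_DIR is not set to an existing directory" >&2
        return 1
    fi
}
```

A possible use inside the container would be `samtools view -H output.bam | header_has_chr_prefix && echo "chr names still present"`, followed by `check_samtools_dir || exit 1` before calling the Polysolver scripts.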

wclee47 commented 6 years ago

Hi Jason,

What a kind and thoughtful answer! Thanks so much for your prompt reply.

I figured out that the scripts are run with the Bourne shell (sh) if we do not specify anything, and sh does not recognize some of the syntax in the script. I noticed that you ran the scripts with "bash", so I tried that, and it worked in my case!

Thank you so much again! I really appreciate it!

Best regards,

Won-Chul
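
For anyone hitting the same "[: ...: unexpected operator" messages, a small sketch of the likely cause follows. It assumes, without having inspected the script, that the failing lines compare strings with `==` inside `[ ]`, which bash accepts but which a plain sh such as dash (commonly /bin/sh on Debian and Ubuntu images) has historically rejected. The function name `is_hg19` is illustrative only.

```shell
#!/bin/sh
# POSIX `[` compares strings with a single `=`, so this form works in
# sh, dash, and bash alike. Writing `[ "$1" == "hg19" ]` instead is a
# bash extension and is the assumed source of the errors in the thread.
# `is_hg19` is an illustrative name, not a function from Polysolver.
is_hg19() {
    [ "$1" = "hg19" ]
}

# Alternative that leaves the script untouched, as done in the thread:
#   bash scripts/shell_call_hla_type test/test.bam Unknown 1 hg19 STDFQ 0 test
```

Invoking the script explicitly through bash avoids editing it, which is the simpler route when the container image has to stay frozen.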
