DiltheyLab / HLA-LA

Fast HLA type inference from whole-genome data
GNU General Public License v3.0

Sample names with strange characters #28

Closed hcurley2 closed 4 years ago

hcurley2 commented 4 years ago

Hi

I am using HLA-LA within the 1,000 genomes project environment, and unfortunately the sample names within the BAMs look like, for example, LP000000-DNA_A01, so your program gives an error. I can reheader using samtools to LP00000DNAA01, and then the program works fine, but to do this I first have to copy the BAM to a location I can write to. As I am trying to run this on thousands of BAMs, this is becoming very time-consuming and memory gobbling! Is there anything I can do to modify the code to allow - and/or _ in sample names?

Any help would be much appreciated!

Helen

AlexanderDilthey commented 4 years ago

Hi Helen,

Not sure whether this is what you need, but you can specify arbitrary sample names via the --sampleID parameter (e.g. you could use --BAM LP000000-DNA_A01.bam and --sampleID LP00000DNAA01). Would this solve your problem?

Best,

Alex
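
For thousands of BAMs, the --sampleID suggestion above could be scripted so that no file has to be copied or reheadered. The following is only a minimal sketch under several assumptions: that the sample ID can be derived from the BAM file name, that only letters and digits are safe in a sample ID, that the executable is called HLA-LA.pl, and that the --graph value shown matches your installation. The helper names sanitize_sample_id and build_command are hypothetical and not part of HLA-LA.

```python
import re
from pathlib import Path
from typing import List


def sanitize_sample_id(name: str) -> str:
    """Remove every character that is not an ASCII letter or digit.

    Assumption: restricting IDs to letters and digits is strict enough
    to avoid the error described in this thread.
    """
    return re.sub(r"[^A-Za-z0-9]", "", name)


def build_command(bam_path: str, graph: str = "PRG_MHC_GRCh38_withIMGT") -> List[str]:
    """Build an HLA-LA argument list for one BAM without modifying the file.

    --BAM and --sampleID are the parameters named in this thread; the
    executable name and the --graph default are assumptions to adapt.
    """
    bam = Path(bam_path)
    # Derive the ID from the file name (without the .bam extension).
    sample_id = sanitize_sample_id(bam.stem)
    return [
        "HLA-LA.pl",
        "--BAM", str(bam),
        "--graph", graph,
        "--sampleID", sample_id,
    ]


if __name__ == "__main__":
    # Print the command for inspection instead of running it.
    print(" ".join(build_command("/data/LP000000-DNA_A01.bam")))
```

Each returned list can be passed to subprocess.run once the printed command looks right. Note that stripping characters can make two different file names collapse to the same ID, so it is worth checking the generated IDs for duplicates before launching a large batch.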