nicolazzie / AffyPipe

an open-source pipeline for Affymetrix Axiom genotyping workflow on livestock species
13 stars 7 forks source link

Suggestion #1

Closed hmkim closed 9 years ago

hmkim commented 9 years ago

Dear, nicolazzie

Firstly, Thanks for your effort to support this nice pipeline.

When I use this tool, I got the result in success.

My suggestion is to add the option can get the A,T,G,C genotype in PED file.

At present, PED file has A,B,0.

Refer to below.

AlleleA = {} # add line AlleleB = {} # add line

This builds the ps2snp.txt file needed by SNPolisher.

out3=open(HERE+'/ps2snp.txt','w') skip=True probeset=0 if opt.PLINK:allps={} for line in open(Smap): if '"Affy SNP ID' in line: out3.write('probeset_id snpid\n') skip=False;continue if skip:continue line=line.replace('"','') probe,snp,nn,crom,pos,rest=line.strip().split(',',5) if not AlleleA.has_key(probe) : # add line AlleleA[probe] = line.strip().split(',')[11] # add line AlleleB[probe] = line.strip().split(',')[12] # add line if opt.PLINK:allps[probe]=(snp,crom,pos) out3.write('%s %s\n' % (probe,snp)) probeset+=1 out3.close() logit("[GOOD NEWS]: Total probe sets read: "+str(probeset)) ...

anims=False
okp=0;allids=[]
# Read Axiom call file, and write map file.
for line in open(opt.DIROUT+'/AxiomGT1.calls.txt'):
    if 'probeset_id' in line:
        allids=[x[:-4] for x in line.strip().split('\t')[1:]]
        G=[[] for x in range(len(allids))]
        anims=True;continue
    if not anims:continue
    line = line.strip().split('\t')
    probe_id = line[0] ####
    probe_a = AlleleA[probe_id] ####
    probe_b = AlleleB[probe_id] ####
    if not keeprobe.has_key(line[0]):continue
    odata=allps.get(line[0],False)
    if not odata:
        bomb('Probe set '+line[0]+' not found in Master Annotation file! Please check this!')
    file_map.write('%s %s 0 %s\n' % (odata[1],odata[0],odata[2]))
    okp+=1
    d=-1
    for x in line[1:]:
        d+=1

-- add lines

        if x == '2':
            geno_code_val = probe_a + ' ' + probe_a
        elif x == '0':
            geno_code_val = probe_b + ' ' + probe_b
        elif x == '1':
            geno_code_val = probe_a + ' ' + probe_b
        elif x == '-1':
            geno_code_val = 'N' + ' ' + 'N'
        else:
            print 'ERROR!!! : ' + x

-- add lines

        #G[d].append(geno_code[x])
        G[d].append(geno_code_val)  # add line
nicolazzie commented 9 years ago

Good idea. I've just done the push with the amended version (and README) containing your idea and most of your code. I've only altered the final part for consistency (and my own coding readability preferences).
Thanks for you suggestion!

hmkim commented 9 years ago

You're welcome. I'm so happy to contribute this project.