What steps will reproduce the problem?
1. Open a file whose header has a "PG" field in an "RG" record, or
2. open a file whose header has a "PP" field in a "PG" record
What is the expected output? What do you see instead?
The file should open, but instead pysam raises:
ValueError: unknown field code 'PG' in record 'RG'
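The error appears to come from looking up each field code in a per-record table of known fields. A minimal sketch of that kind of check follows; the function check_record, both abbreviated tables, and the sample header line are hypothetical stand-ins for illustration, not pysam's actual code:

```python
# Abbreviated, hypothetical field tables. The first lacks the "PG" code
# for RG records and the "PP" code for PG records; the second adds them.
KNOWN_FIELDS = {
    "RG": {"ID": str, "SM": str, "LB": str},
    "PG": {"PN": str, "ID": str, "VN": str, "CL": str},
}
KNOWN_FIELDS_FIXED = {
    "RG": {"ID": str, "SM": str, "LB": str, "PG": str},
    "PG": {"PN": str, "ID": str, "VN": str, "CL": str, "PP": str},
}


def check_record(line, valid_fields):
    """Parse one header line, converting values via the field table."""
    parts = line.rstrip("\n").split("\t")
    record = parts[0][1:]  # "@RG" -> "RG"
    parsed = {}
    for part in parts[1:]:
        # Split on the first ":" only, since values may contain colons.
        key, value = part.split(":", 1)
        if key not in valid_fields[record]:
            raise ValueError(
                "unknown field code '%s' in record '%s'" % (key, record))
        parsed[key] = valid_fields[record][key](value)
    return parsed


line = "@RG\tID:rg1\tSM:sample1\tPG:aligner"  # made-up read group line
try:
    check_record(line, KNOWN_FIELDS)
except ValueError as error:
    print(error)  # unknown field code 'PG' in record 'RG'
print(check_record(line, KNOWN_FIELDS_FIXED))
```

With the extended table the same line parses without an exception, which is the behaviour expected when opening the file.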
What version of the product are you using? On what operating system?
Version 0.6, but the problem is present in 0.7 as well, on Python 2.6.
Please provide any additional information below.
Not all fields defined in the SAM 1.4 specification are known to pysam. The PG
field within the RG record describes which program was used to create the read
group, so it is distinct from the PG record, which describes the programs used
to create the BAM file as a whole (and not just one of its read groups).
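To illustrate the distinction, here is a hypothetical header in which the PG field of an RG record names the ID of a PG record, and one PG record points to its predecessor through PP. All IDs and program names are invented for the example, and the small split_header helper is only a sketch, not part of pysam:

```python
# Hypothetical SAM header; every ID and name here is made up.
HEADER_TEXT = "\n".join([
    "@HD\tVN:1.4\tSO:coordinate",
    "@RG\tID:rg1\tSM:sample1\tPG:aligner",
    "@PG\tID:aligner\tPN:aligner",
    "@PG\tID:sorter\tPN:sorter\tPP:aligner",
])


def split_header(text):
    """Split header text into (record_type, {field: value}) pairs."""
    records = []
    for line in text.splitlines():
        parts = line.split("\t")
        record_type = parts[0][1:]  # drop the leading "@"
        # Split on the first ":" only, since values may contain colons.
        fields = dict(part.split(":", 1) for part in parts[1:])
        records.append((record_type, fields))
    return records


records = split_header(HEADER_TEXT)
program_ids = set(f["ID"] for t, f in records if t == "PG")
read_groups = [f for t, f in records if t == "RG"]

# The PG field of the read group refers to one of the @PG records,
# while the @PG records together describe the whole file.
print(read_groups[0]["PG"] in program_ids)
```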
All of these fields are accepted when you replace the header description
section in the pysam/csamtools.pyx source code with:
# valid types for sam headers
VALID_HEADER_TYPES = { "HD" : dict,
                       "SQ" : list,
                       "RG" : list,
                       "PG" : list,
                       "CO" : list }

# order of records within sam headers
VALID_HEADERS = ( "HD", "SQ", "RG", "PG", "CO" )

# type conversions within sam header records
VALID_HEADER_FIELDS = { "HD" : { "VN" : str, "SO" : str, "GO" : str },
                        "SQ" : { "SN" : str, "LN" : int, "AS" : str, "M5" : str, "UR" : str, "SP" : str },
                        "RG" : { "ID" : str, "SM" : str, "LB" : str, "DS" : str, "PU" : str, "PI" : str,
                                 "CN" : str, "DT" : str, "PL" : str, "FO" : str, "KS" : str, "PG" : str },
                        "PG" : { "PN" : str, "ID" : str, "VN" : str, "CL" : str, "PP" : str }, }

# output order of fields within records
VALID_HEADER_ORDER = { "HD" : ( "VN", "SO", "GO" ),
                       "SQ" : ( "SN", "LN", "AS", "M5", "UR", "SP" ),
                       "RG" : ( "ID", "SM", "LB", "DS", "PU", "PI", "CN", "DT", "PL", "FO", "KS", "PG" ),
                       "PG" : ( "PN", "ID", "VN", "CL", "PP" ), }
Original issue reported on code.google.com by bratdak...@gmail.com on 21 Dec 2012 at 9:52