Closed Meravbraitbard closed 5 years ago
In theory, yes, we can use this data, however it is not a format that I have seen before. What is the value inside the parentheses, e.g.: (1-1), (2-2)? Also, is the DHDD-S column the %D value and the standard deviation?
Is this a standard format (straight from HDX/MS analysis software)? If so, I am happy to create a function in the code to handle this format of data.
Otherwise, the minimal format, HDX_columns, requires is of the following format:
sequence, start_res, end_res, time, D_inc
AEDMAADEVTAPPRKVLIISAGASHSVAL, 2, 30, 10.0, 15.20
Hi again, I tried to create a script and change the input that will be like the required input. and I am attaching the new input. but, it's not working, what is missing? I am attaching also the script I am using. Thanks
Hi again, I tried to create a script and change the input that will be like the required input. and I am attaching the new input. but, it's not working, what is missing? I am attaching also the script I am using. Thanks
Hi. The HDXWorkbench macro is only useful for data originating from HDXWorkbench. It is a very specific format that contains two different protein states.
To work with your HDX data, remove the entire header from the data file and read it with the hxio.import_HXcolumns()
parser. The values from the header values can be inputted in the running script (shown below). NOTE, I made a mistake in my above comment and the sequence
column should be named peptide_seq
.
New files are here: hdx_files.zip
So this was your original data file (truncated):
experiment attribute,value
Experiment name,DHDDS_HDX
Offset,0
Deuterium solution concentration,0.9
Experiment Protein Sequence, GSGSGSGMSWIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVERQEGHSQGFNKLAETLRWCLNLGILEVTVYAFSIENFKRSKSEVDGLMDLARQKFSRLMEEKEKLQKHGVCIRVLGDLHLLPLDLQELIAQAVQATKNYNKCFLNVCFAYTSRHEISNAVREMAWGVEQGLLDPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSHSCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQKARDMYAEERKRQQLERDQATVTEQLLREGLQASGDAQLRRTRLHKLSARREERVQGFLQALELKRADWLARLGTASA
sequence,start_res,end_res,time,D_inc
GSGSGSGMSW,1,10,0,0.00
Instead, the file should simply be:
peptide_seq,start_res,end_res,time,D_inc
GSGSGSGMSW,1,10,0,0.00
To input this data into the method, the lines in the running script should look like this:
inseq = 'GSGSGSGMSWIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVERQEGHSQGFNKLAETLRWCLNLGILEVTVYAFSIENFKRSKSEVDGLMDLARQKFSRLMEEKEKLQKHGVCIRVLGDLHLLPLDLQELIAQAVQATKNYNKCFLNVCFAYTSRHEISNAVREMAWGVEQGLLDPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSHSCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQKARDMYAEERKRQQLERDQATVTEQLLREGLQASGDAQLRRTRLHKLSARREERVQGFLQALELKRADWLARLGTASA'
# Initialize model
sys = system.System(output_dir=outputdir, noclobber=False)
mol = sys.add_macromolecule(inseq, "Molecule")
# Import data
dataset = hxio.import_HXcolumns(workbench_file, # path to the file
inseq, # Target sequence
name="Data", # A name in case you have many data files
percentD=True, # Percent rather than absolute deuterium
conditions=None, # Standard conditions (293K, pH=7.0)
error_estimate=5.0, # A guess on the error in %D units
n_fastamides=2, #Number of fast exchanging amides that are disregarded in each peptide
offset=0) #Sequence and peptide numbering offset
# Each molecule has one state already defined, the Apo.
state = mol.get_apo_state()
# Add the dataset to this state.
state.add_dataset(dataset)
# Define the HDX representation model. ResidueGridModel models the protection factor at each residue along a finite grid.
output_model = state.set_output_model(model.ResidueGridModel(state, grid_size=num_exp_bins))
# Initialize output
sys.output.initialize_output_model_file(state, output_model.pf_grids)
The updated files are contained here. I have also changed the annealing protocol in this script to a new one that is significantly superior to the original method.
DHDDS_results_HDX-Mar19_percD.txt Hi, I am working on the first time with HDX data and i tried to use the HDXWorkbench example script. but the data I have looks a little bit different from the input in the example (attached). Is there a way to fit my data to the script? Thanks!