[FEATURE] Merging individual scored files and sub-sampling a merged file
I have added some features for merging and sub-sampling.
This can be split into two aspects:
Merging individual scored osw files
Sub-sampling a merged.osw file
1. Merging individual scored osw files
I have extended the merge function so that it can merge individual files that have already been scored.
I had several osw files from the same experiment, each of which had been scored separately, but I needed a single merged osw file to run an external analysis script. Currently, merge either removes the extra score tables or calls the reduced (oswr) merge function.
I have added an option to the merge function to indicate that the inputs are post-scored runs (--merge_post_scored_runs). When it is set, merge calls merge_oswps to merge all runs and retain all tables. (See: levels_contexts.py#L824-L1002)
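To illustrate the idea of merging while retaining every table, here is a minimal sketch using only the standard sqlite3 module. It is not the code in merge_oswps: the function name merge_all_tables is hypothetical, and the sketch simply appends rows table by table. It assumes every input file shares the same schema, and it does not deduplicate tables whose content is shared between runs, which a real merge has to handle.

```python
import sqlite3


def merge_all_tables(infiles, outfile):
    """Append every table of each input SQLite file into outfile.

    Hypothetical helper for illustration only; assumes identical schemas
    across inputs and performs no deduplication of shared rows.
    """
    out = sqlite3.connect(outfile)
    try:
        for path in infiles:
            # Make the input file visible to the output connection as "src".
            out.execute("ATTACH DATABASE ? AS src", (path,))
            tables = out.execute(
                "SELECT name, sql FROM src.sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
            ).fetchall()
            for name, create_sql in tables:
                exists = out.execute(
                    "SELECT 1 FROM main.sqlite_master "
                    "WHERE type = 'table' AND name = ?",
                    (name,),
                ).fetchone()
                if not exists:
                    # Recreate the table in the output with its original DDL.
                    out.execute(create_sql)
                out.execute(
                    f'INSERT INTO main."{name}" SELECT * FROM src."{name}"'
                )
            # Commit before detaching; DETACH is not allowed inside a transaction.
            out.commit()
            out.execute("DETACH DATABASE src")
    finally:
        out.close()
```

Because no table is skipped, score tables present in the inputs end up in the merged file alongside the RUN table.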
2. Sub-sampling a merged.osw file
This was requested by @Matthias313. He wanted to create a sub-sampled osw from a merged.osw file, instead of sub-sampling from individual runs. I think he was worried about the comparability of sub-sampling the individual runs alone, but maybe he can comment further on his concerns.
I have added a check to see whether the input file contains more than one run in the RUN table. (See: levels_contexts.py#L290-L294)
If you sub-sample a merged.osw file, the PRECURSOR, TRANSITION and TRANSITION_PRECURSOR_MAPPING tables are no longer present, so you would have to call the merge function again with a template to append them.
To avoid this extra step, if the file supplied to subsample contains multiple runs, I append the tables needed for scoring. (See: levels_contexts.py#L433-L481)
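The check and the table append described above could look roughly like the following sketch. It is an illustration under assumptions, not the code in levels_contexts.py: the helper names count_runs and copy_scoring_tables are hypothetical, and the sketch copies the three tables in full from the merged input into the sub-sampled output using the standard sqlite3 module.

```python
import sqlite3

# Tables named in this PR as required for scoring.
SCORING_TABLES = ("PRECURSOR", "TRANSITION", "TRANSITION_PRECURSOR_MAPPING")


def count_runs(osw_path):
    """Return the number of rows in the RUN table of an osw file."""
    con = sqlite3.connect(osw_path)
    try:
        return con.execute("SELECT COUNT(*) FROM RUN").fetchone()[0]
    finally:
        con.close()


def copy_scoring_tables(merged_path, subsampled_path):
    """Copy the scoring tables into the sub-sampled file if the input is merged.

    Hypothetical helper: returns True when the tables were copied, and
    False when the input holds a single run and nothing was done.
    """
    if count_runs(merged_path) <= 1:
        return False
    con = sqlite3.connect(subsampled_path)
    try:
        con.execute("ATTACH DATABASE ? AS src", (merged_path,))
        for table in SCORING_TABLES:
            # Replace any existing copy so the call can be repeated safely.
            con.execute(f'DROP TABLE IF EXISTS main."{table}"')
            con.execute(
                f'CREATE TABLE main."{table}" AS SELECT * FROM src."{table}"'
            )
        con.commit()
        con.execute("DETACH DATABASE src")
    finally:
        con.close()
    return True
```

Note that CREATE TABLE ... AS SELECT copies the data but not indexes or constraints, so a production version would likely reuse the original table definitions instead.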
I have performed tests on individual sub-sampling and merged sub-sampling, and based on my results they seem comparable.
Pyprophet report of the merged individually sub-sampled runs (model.osw)
merged_individual_subsampled_runs_model.pdf
Pyprophet report of applying weights to run 1
run_1_applied_weights.pdf
Pyprophet report of applying weights to run 2
run_2_applied_weights.pdf
Pyprophet report of sub-sampling a merged file (model.osw)
merged_subsampled_model.pdf
Pyprophet report of applying weights to merged file
merged_subsampled_applied_weights.pdf
If there is anything else I need to add or do, or if you have any comments, please let me know.
Warm Regards,
Justin