[FEATURE] Merging individual scored files and sub-sampling a merged file

I have added some features for merging and sub-sampling.

This can be split into two aspects:

1. Merging individual scored osw files
2. Sub-sampling a merged.osw file

1. Merging individual scored osw files

I have extended the merge function to allow merging of individual files that have already been scored. I had several osw files from the same experiment, each scored separately, but I needed a single merged osw file to run an external analysis script. Currently, merge would either remove the extra score tables or call the oswr reduced merge function.

I have added an option to the merge function to indicate that post-scored runs are being merged (--merge_post_scored_runs). When it is set, merge calls merge_oswps to merge all runs and retain all tables. (See: levels_contexts.py#L824-L1002)
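To illustrate the idea of merging while retaining all tables, here is a minimal sketch using only Python's sqlite3 module (osw files are SQLite databases). It is not the code in merge_oswps: the function name merge_keep_all_tables is hypothetical, and the sketch assumes that rows shared between runs are byte-identical so they can be de-duplicated with EXCEPT.

```python
import sqlite3


def merge_keep_all_tables(infiles, outfile):
    """Append every table of every input file to outfile (hypothetical helper).

    Assumption: tables with the same name have the same columns in all inputs,
    and rows shared between runs are identical, so EXCEPT skips them.
    """
    out = sqlite3.connect(outfile)
    try:
        for path in infiles:
            out.execute("ATTACH DATABASE ? AS src", (path,))
            tables = out.execute(
                "SELECT name, sql FROM src.sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
            ).fetchall()
            for name, create_sql in tables:
                exists = out.execute(
                    "SELECT 1 FROM main.sqlite_master "
                    "WHERE type = 'table' AND name = ?",
                    (name,),
                ).fetchone()
                if not exists:
                    # An unqualified CREATE TABLE lands in the main database.
                    out.execute(create_sql)
                # Copy only rows that are not already present in the output.
                out.execute(
                    f'INSERT INTO main."{name}" '
                    f'SELECT * FROM src."{name}" '
                    f'EXCEPT SELECT * FROM main."{name}"'
                )
            # Commit before detaching; DETACH fails inside an open transaction.
            out.commit()
            out.execute("DETACH DATABASE src")
    finally:
        out.close()
```

Because nothing is filtered by table name, score tables written by an earlier scoring step are carried over together with the feature tables.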

2. Sub-sampling a merged.osw file

This was requested by @Matthias313. He wanted to create a sub-sampled osw from a merged.osw file, instead of sub-sampling from individual runs. I think he was worried about the comparability of sub-sampling the individual runs alone, but maybe he can comment further on his concerns.

I have added a check to see whether the input file contains more than one run in the RUN table. (See: levels_contexts.py#L290-L294) If you sub-sample a merged.osw file, the PRECURSOR, TRANSITION and TRANSITION_PRECURSOR_MAPPING tables are no longer present in the output, meaning you would have to call the merge function again with a template to append those tables. To avoid this extra step, if the file supplied for sub-sampling contains multiple runs, I append these tables, which are needed for scoring. (See: levels_contexts.py#L433-L481)
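The check and the table append described above can be sketched roughly as follows. This is an illustration with plain sqlite3, not the code in levels_contexts.py: the helper names count_runs and append_scoring_tables are hypothetical, and the sketch assumes the three tables are copied whole from the merged input into the sub-sampled output.

```python
import sqlite3

# Tables named in the description as missing after sub-sampling a merged file.
SCORING_TABLES = ("PRECURSOR", "TRANSITION", "TRANSITION_PRECURSOR_MAPPING")


def count_runs(path):
    """Return the number of rows in the RUN table of an osw (SQLite) file."""
    con = sqlite3.connect(path)
    try:
        return con.execute("SELECT COUNT(*) FROM RUN").fetchone()[0]
    finally:
        con.close()


def append_scoring_tables(merged_path, subsampled_path):
    """Copy the scoring tables into the sub-sampled file if the input is merged.

    Returns True if tables were copied, False for a single-run input.
    """
    if count_runs(merged_path) <= 1:
        return False
    con = sqlite3.connect(subsampled_path)
    try:
        con.execute("ATTACH DATABASE ? AS merged", (merged_path,))
        for table in SCORING_TABLES:
            # Replace any stale copy, then copy the table from the merged file.
            con.execute(f"DROP TABLE IF EXISTS main.{table}")
            con.execute(
                f"CREATE TABLE main.{table} AS SELECT * FROM merged.{table}"
            )
        con.commit()
        con.execute("DETACH DATABASE merged")
    finally:
        con.close()
    return True
```

With something like this in place, the sub-sampled file can be passed straight to scoring without a second merge against a template.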

I have performed tests on individual sub-sampling and merged sub-sampling, and based on my results they seem comparable.

If there is anything else I need to add or do, or if you have any comments, please let me know.

Warm Regards,

Justin

PyProphet / pyprophet

Feature/merged subsampling #82

Pyprophet report of the merged individually sub-sampled runs (model.osw)

Pyprophet report of applying weights to run 1

Pyprophet report of applying weights to run 2

Pyprophet report of sub-sampling a merged file (model.osw)

Pyprophet report of applying weights to merged file