Kortemme-Lab / sequence-tolerance

A sequence tolerance benchmark capture containing the benchmark dataset and benchmarked protocol captures.
https://kortemmelab.ucsf.edu/benchmarks
MIT License

specify input/output files for running sequence tolerance analysis #11

Closed momeara closed 9 years ago

spiderbaby commented 9 years ago

Hi,

In analysis/README.rst, the example command line is:

cd output/sample
R
> source("../../analysis/sequence_tolerance.R")
> process_specificity()

This is how the original analysis script was written, and it has been available in this form since the original publication. I'm guessing you've seen the commands above but would like to specify the input and output directories on the command line. We can definitely add this, but I would prefer to leave it until after the paper submission if possible. If you think it should be done beforehand, I can update the script.

momeara commented 9 years ago

Actually, I just wanted to know which files the script looks for and which files it generates.

spiderbaby commented 9 years ago

Ah, understood. I've updated the text:

The main analysis is performed by an R script. Navigate to the directory with the output data from the run and then call this R script i.e.:

cd output/sample
R
> source("../../analysis/sequence_tolerance.R")
> process_seqtol()

The command above uses the default values for process_seqtol. These can be overridden and are, in order:

See the Rosetta documentation <https://www.rosettacommons.org/docs/latest/sequence-tolerance.html> or [1] for more details. For example, the process_seqtol call above uses the default values and is equivalent to:

process_seqtol('.', c(1/2.5, 1/2.5, 1/2.5, 1), 0.228, c("boltzmann", "cutoff"), .5, "specificity", FALSE, TRUE)

The input files for the analysis script are generated by the sequence tolerance step:

The analysis script currently works with files generated by the Rosetta sequence tolerance application; however, it can be modified to work more generally if different sequence tolerance protocols are added to this capture.

By default, the analysis generates the following files:

For more details, see the Smith & Kortemme 2010 paper (references below).

spiderbaby commented 9 years ago

Also, I see that Colin already added an option for input directory (dirpath).
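
Given the dirpath option mentioned above, here is a minimal sketch of running the analysis from the repository root without changing into the output directory first. It assumes that the first argument of process_seqtol is that input directory (consistent with the '.' in the equivalent call quoted earlier) and that the repository layout matches the README example. The wrapper name run_seqtol and its arguments are hypothetical and not part of the repository, and where the generated files end up is not confirmed here.

```r
# Sketch only: run the analysis from the repository root.
# Assumptions (not confirmed in this thread):
#   - process_seqtol() takes the input directory (dirpath) as its first argument
#   - the script lives in analysis/ and run output in output/<run_name>/
# run_seqtol, repo_root and run_name are hypothetical names for illustration.
run_seqtol <- function(repo_root = ".", run_name = "sample") {
  script_path <- file.path(repo_root, "analysis", "sequence_tolerance.R")
  data_dir <- file.path(repo_root, "output", run_name)

  # Fail early with a clear message if either path is missing.
  if (!file.exists(script_path)) {
    stop("analysis script not found: ", script_path)
  }
  if (!dir.exists(data_dir)) {
    stop("output directory not found: ", data_dir)
  }

  # Load the analysis functions into this function's environment, then
  # call process_seqtol with the input directory and defaults otherwise.
  source(script_path, local = TRUE)
  process_seqtol(data_dir)
}
```

From the repository root this would be called as run_seqtol(), or run_seqtol(run_name = "...") for a different run directory under output/.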