argument parser for main function

annaritz commented 4 years ago

We're now overwriting each other's main functions, and you have two main functions. We need an argument parser to distinguish these use cases. (I can do this if necessary.)

TobiasRubel commented 4 years ago

I'm happy to do it. main_2 was built just for parameter sweeps, and can be removed without causing any problems.

What overwriting are we doing? Do you have in mind passing the METHODS list, amongst other things?

annaritz commented 4 years ago

We should be able to do any of the following without modifying hard-coded variables:

Run a single method on a single pathway
Run multiple methods on a single pathway
Run a single method on all pathways
Compute PR for all existing datafiles
Plot PR for all existing data files

...anything else? Your code is already nicely set up for modifying values of k for HybridLinker, but we will also need to account for different values of PathLinker k and different values of ResponseNet gamma (there will also be a PCSF omega parameter).

annaritz commented 4 years ago

A --run_all or --pr_all or --plot_all could be useful.

TobiasRubel commented 4 years ago

I took a stab at updating main.py to have a command line interface:

$ python3 main.py -h
usage: main.py [-h] [--pr_all] [--plot_all] [-p PATHWAYS [PATHWAYS ...]] [-m METHODS [METHODS ...]] [-k K] [-y Y]

optional arguments:
  -h, --help            show this help message and exit
  --pr_all              Compute Precision/Recall plots for all data in DEST_PATH
  --plot_all            Plot Precision/Recall plots for all data in DEST_PATH
  -p PATHWAYS [PATHWAYS ...], --pathways PATHWAYS [PATHWAYS ...]
                        A list of pathways to make predictions for. Possible options are: IL6 Leptin Oncostatin_M IL3 TSLP IL1 BDNF NP_pathways.zip Hedgehog IL TCR CRH FSH AndrogenReceptor all RANKL TNFalpha Wnt BCR Alpha6Beta4Integrin
                        Notch Prolactin KitReceptor ID IL4 TSH RAGE IL5 TGF_beta_Receptor EGFR1 TWEAK IL2 IL9
  -m METHODS [METHODS ...], --methods METHODS [METHODS ...]
                        A list of algorithms to run. Possible options are: run_ResponseNet run_BowtieBuilder run_ShortestPaths run_PathLinker run_LocPL run_PerfectLinker_edges run_PerfectLinker_nodes run_HybridLinker run_HybridLinker_SP
                        run_HybridLinker_BFS run_HybridLinker_BFS_Weighted run_HybridLinker_DFS_Weighted run_HybridLinker_paramsweep all
  -k K                  k value to pass to PathLinker. Defaults to 10,000.
  -y Y                  gamma value to pass to ResponseNet. Defaults to 20.
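
For readers who want to see how a help message like the one above can be produced, here is a minimal argparse sketch. It is a reconstruction from the help text, not the code in main.py: the function name build_parser, the shortened help strings, and the argument types (int for k, float for gamma) are assumptions.

```python
import argparse


def build_parser():
    """Build a parser whose flags mirror the help output shown above."""
    parser = argparse.ArgumentParser(prog='main.py')
    # Boolean switches: False unless the flag is present.
    parser.add_argument('--pr_all', action='store_true',
                        help='Compute Precision/Recall plots for all data in DEST_PATH')
    parser.add_argument('--plot_all', action='store_true',
                        help='Plot Precision/Recall plots for all data in DEST_PATH')
    # nargs='+' collects one or more values into a list.
    parser.add_argument('-p', '--pathways', nargs='+', default=[],
                        help='A list of pathways to make predictions for.')
    parser.add_argument('-m', '--methods', nargs='+', default=[],
                        help='A list of algorithms to run.')
    # Types are assumptions; the help text only states the defaults.
    parser.add_argument('-k', type=int, default=10000,
                        help='k value to pass to PathLinker. Defaults to 10,000.')
    parser.add_argument('-y', type=float, default=20,
                        help='gamma value to pass to ResponseNet. Defaults to 20.')
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(['-p', 'Wnt', '-m', 'run_ResponseNet'])
print(args.pathways, args.methods, args.k, args.y)
```

Passing an explicit list to parse_args() makes the parser easy to exercise from a test or an interactive session; calling parse_args() with no arguments reads the real command line.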

This was my first time using argparse, so the interface is probably not the most intuitive. For example, to run ResponseNet on Wnt:

python3 main.py -p Wnt -m run_ResponseNet

To run all methods on all pathways and generate precision recall plots, run

python3 main.py --pr_all --plot_all --pathways all --methods all

The code fetches which methods exist by reading its own source code and searching for a regex:

    # Raw string so the escaped parenthesis reaches the regex engine unchanged.
    with open('main.py','r') as f:
        METHODS = re.findall(r'run_.*(?=\()',f.read())[:-1]

so implementing new methods is just a matter of writing a run_xx() method.
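
As a possible alternative to scanning the source text, the same "anything named run_xx is a method" convention can be implemented by inspecting the module namespace. This is a sketch of a different technique, not what main.py does; discover_methods, helper, and the two stand-in functions are hypothetical names used only for illustration.

```python
def run_ResponseNet(pathway):
    """Stand-in for a real method; the body is a placeholder."""
    return f'ResponseNet on {pathway}'


def run_PathLinker(pathway):
    """Stand-in for a real method; the body is a placeholder."""
    return f'PathLinker on {pathway}'


def helper(pathway):
    """Not a method: the name lacks the run_ prefix, so it is skipped."""
    return pathway


def discover_methods(namespace):
    """Return the sorted names of callables in namespace that start with 'run_'."""
    return sorted(name for name, obj in namespace.items()
                  if name.startswith('run_') and callable(obj))


# globals() is the namespace of the current module.
METHODS = discover_methods(globals())
print(METHODS)
```

Because this looks at defined objects rather than text, a call such as run_PathLinker(...) elsewhere in the file, or the discovery code itself, cannot show up as an extra match.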

It would be useful to add a feature so that we could generate or plot PR data for a subset of the predictions, rather than only for all of them. Because of this, and because of my general unease about the structure of the interface, I'm leaving the issue open for now, but the functionality is in the main branch.
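
One way the subset feature could look, sketched under assumptions: a hypothetical --pr option (not part of the current interface) that takes zero or more prediction names, where a bare --pr means "all". The names build_subset_parser, select_predictions, and the example prediction labels are invented for illustration.

```python
import argparse


def build_subset_parser():
    """Parser with a hypothetical --pr option that accepts an optional list of names."""
    parser = argparse.ArgumentParser(prog='main.py')
    # With nargs='*': flag absent -> None, bare flag -> [], names given -> list of names.
    parser.add_argument('--pr', nargs='*', default=None, metavar='PREDICTION',
                        help='Compute PR for the named predictions; no names means all.')
    return parser


def select_predictions(requested, available):
    """Map the parsed --pr value onto the list of predictions to process."""
    if requested is None:
        return []                    # flag not given: compute nothing
    if not requested:
        return list(available)       # bare flag: compute everything
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError(f'unknown predictions: {unknown}')
    return list(requested)


# Hypothetical prediction labels; the real names depend on the files in DEST_PATH.
available = ['PathLinker_Wnt', 'ResponseNet_Wnt']
args = build_subset_parser().parse_args(['--pr', 'ResponseNet_Wnt'])
print(select_predictions(args.pr, available))
```

A single option covers the "all" and "subset" cases, which would let --pr_all remain as a shorthand or be retired later.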

TobiasRubel / Pathway-Reconstruction-Tools

argument parser for main function #6