JamesOwers / midi_degradation_toolkit

A toolkit for generating datasets of midi files which have been degraded to be 'un-musical'.
MIT License

Created new get_results script #79

Closed. JamesOwers closed this 5 years ago.

JamesOwers commented 5 years ago

The script baselines/get_results.py generates all figures and tables for the paper.

You can run it from a notebook too (so you can easily view all plots in one place). See example here: https://gist.github.com/JamesOwers/04a877c4f0ab474ca1d4fd8f32e0110b. (Put that in the root dir and alter the config).

apmcleod commented 5 years ago

This isn't working for me from the command line:

ModuleNotFoundError: No module named '__main__.eval_task'; '__main__' is not a package
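(Aside: this error usually means the script does a relative import while being executed directly as a file, so Python treats __main__ as a non-package. A guess at the offending line, with only eval_task taken from the error message itself:)

# Hypothetical top of baselines/get_results.py; the imported name is a guess,
# only 'eval_task' comes from the error above.
from .eval_task import construct_parser

# Run as a plain file, the script is module '__main__', not part of the
# 'baselines' package, so the relative import resolves to '__main__.eval_task'
# and fails. Running it as a module keeps the package context:
#   python -m baselines.get_results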
apmcleod commented 5 years ago

Now:

(mdtk_dev) mcleod@ampc01:/misc/home/mcleod/workspace/midi_degradation_toolkit$ ./baselines/get_results.py 
Traceback (most recent call last):
  File "./baselines/get_results.py", line 345, in <module>
    main(args)
  File "./baselines/get_results.py", line 190, in main
    setting_names = [eval(ll) for ll in args.setting_names]
TypeError: 'NoneType' object is not iterable
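(The TypeError suggests --setting_names has no default: argparse leaves an omitted optional flag as None, and the comprehension then tries to iterate over it. A minimal reproduction, assuming the flag is declared with nargs='+':)

import argparse

# The add_argument line is an assumption; the eval() comprehension is
# verbatim from line 190 of the traceback.
parser = argparse.ArgumentParser()
parser.add_argument('--setting_names', nargs='+')  # no default -> None
args = parser.parse_args([])  # flag omitted, as in the run above
setting_names = [eval(ll) for ll in args.setting_names]  # TypeError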

Tbh, I'd just as soon not have this on the main branch anyway, I think. The other scripts (training, eval) are things that people will definitely want if they make their own model, but there's no guarantee people will want the same plots we have, the same grid searches, the same directory structures, etc.

Even for reproducibility, I seriously doubt anyone will want to redo the entire grid search, plots, etc. At most, they'd just train our one best model with default settings and eval it, or even just eval our pre-trained models.

apmcleod commented 5 years ago

I made a few other attempts at the --setting_names param, but it's unclear to me exactly how it works. I can look at this more after the deadline if you want.

(mdtk_dev) mcleod@ampc01:/misc/home/mcleod/workspace/midi_degradation_toolkit$ ./baselines/get_results.py --setting_names lr
Traceback (most recent call last):
  File "./baselines/get_results.py", line 345, in <module>
    main(args)
  File "./baselines/get_results.py", line 190, in main
    setting_names = [eval(ll) for ll in args.setting_names]
  File "./baselines/get_results.py", line 190, in <listcomp>
    setting_names = [eval(ll) for ll in args.setting_names]
  File "<string>", line 1, in <module>
NameError: name 'lr' is not defined

(mdtk_dev) mcleod@ampc01:/misc/home/mcleod/workspace/midi_degradation_toolkit$ ./baselines/get_results.py --setting_names ['lr']
Traceback (most recent call last):
  File "./baselines/get_results.py", line 345, in <module>
    main(args)
  File "./baselines/get_results.py", line 190, in main
    setting_names = [eval(ll) for ll in args.setting_names]
  File "./baselines/get_results.py", line 190, in <listcomp>
    setting_names = [eval(ll) for ll in args.setting_names]
  File "<string>", line 1, in <module>
NameError: name 'lr' is not defined

(mdtk_dev) mcleod@ampc01:/misc/home/mcleod/workspace/midi_degradation_toolkit$ ./baselines/get_results.py --setting_names np.array(['lr'])
bash: syntax error near unexpected token `('

(mdtk_dev) mcleod@ampc01:/misc/home/mcleod/workspace/midi_degradation_toolkit$ ./baselines/get_results.py --setting_names "np.array(['lr'])"
Traceback (most recent call last):
  File "./baselines/get_results.py", line 345, in <module>
    main(args)
  File "./baselines/get_results.py", line 198, in main
    formats = dict(zip(task_names, args.formats))
TypeError: zip argument #1 must support iteration
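(Judging from the eval() call in the traceback and the notebook config further down, each --setting_names value has to reach the script as a single Python list literal. Unquoted, bash strips the single quotes from ['lr'] so eval sees [lr], hence the NameError; double-quoting the whole literal should work. The last attempt actually got past --setting_names and failed on --formats for the same missing-default reason. A sketch, with the add_argument declaration assumed:)

import argparse

# Shell form this corresponds to (assumed):
#   ./baselines/get_results.py --setting_names "['lr','wd','hid']"
parser = argparse.ArgumentParser()
parser.add_argument('--setting_names', nargs='+')  # declaration is an assumption
args = parser.parse_args(['--setting_names', "['lr','wd','hid']"])
setting_names = [eval(ll) for ll in args.setting_names]
print(setting_names)  # [['lr', 'wd', 'hid']]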
JamesOwers commented 5 years ago

I'll try to make this a bit clearer when I'm running the ACME v1.0 experiments. I think it would be good to have a complete pipeline. Currently it works for me from within a Jupyter notebook using parameters as follows:

from baselines.get_results import (construct_parser, main, 
    plot_confusion, round_to_n, plot_log_file)

# CONFIG - EDIT THESE

# location of logs and model checkpoints
output_dir = 'mlp_output'

# location of the pianoroll and command corpus datasets
in_dir = '/disk/scratch/s0816700/data/acme'

# Names of tasks (folders in output_dir)
task_names = ['task1', 'task1weighted', 'task2', 'task3',
              'task3weighted', 'task4']

# The names of variables gridsearched over for each task
sn = ['lr', 'wd', 'hid']
# setting_names = [sn, sn, sn, sn, sn, sn]
setting_names = [sn, sn, sn, sn, sn, ['lr', 'wd', 'hid', 'lay']]

formats = list({
    'task1': 'command',
    'task1weighted': 'command',
    'task2': 'command',
    'task3': 'pianoroll',
    'task3weighted': 'pianoroll',
    'task4': 'pianoroll',
}.values())
seq_len = list({
    'task1': '1000',
    'task1weighted': '1000',
    'task2': '1000',
    'task3': '250',
    'task3weighted': '250',
    'task4': '250',
}.values())
metrics = list({
    'task1': 'rev_f',
    'task1weighted': 'rev_f',
    'task2': 'avg_acc',
    'task3': 'f',
    'task3weighted': 'f',
    'task4': 'helpfulness',
}.values())
task_desc = list({
    'task1': 'Error Detection',
    'task1weighted': 'Error Detection',
    'task2': 'Error Classification',
    'task3': 'Error Location',
    'task3weighted': 'Error Location',
    'task4': 'Error Correction',
}.values())
task_desc = [tt.replace(' ', '') for tt in task_desc]
args_str = (
    f"--output_dir {output_dir} "
    f"--save_plots {output_dir} "
    f"--in_dir {in_dir} "
    f"--task_names {' '.join(task_names)} "
    f"--setting_names {' '.join([str(sn).replace(' ', '') for sn in setting_names])} "
    f"--formats {' '.join(formats)} "
    f"--seq_len {' '.join(seq_len)} "
    f"--metrics {' '.join(metrics)} "
    f"--task_desc {' '.join(task_desc)} "
    f"--splits train valid test"
#     f"--splits test"
)
parser = construct_parser()
args = parser.parse_args(args_str.split())
results, min_idx, task_eval_log, res_df, summary_tab_, confusion = main(args)
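One small caveat with the snippet above: args_str.split() will mangle any path that contains spaces. shlex.split is a drop-in replacement that respects shell-style quoting (a minor robustness tweak, not part of the original snippet):

import shlex

args = parser.parse_args(shlex.split(args_str))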
apmcleod commented 5 years ago

I can see that, I suppose. I think it makes sense to just include that as an example notebook, maybe? I don't think anyone will want to perform the entire grid search. At most, they'd want to train the models that we found were best, but more likely they'd just want to run our pretrained model for each task, if anything.

I'm also a bit unclear whether this is going in the GitHub repo only (perfectly fine with that) or in the pip install package as well (that's where I'm more uncertain).

JamesOwers commented 5 years ago

I'm also a bit unclear whether this is going in the GitHub repo only (perfectly fine with that) or in the pip install package as well (that's where I'm more uncertain).

Bit confused by this - the pip install simply makes the mdtk package usable (everything under ./mdtk). None of what is in ./baselines is installed. However, I would presume that, since we're not sticking this up on PyPI just yet, everyone is going to clone the repo, get all the code, then use pip to install `mdtk` as we'll describe in the README.
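(For context, a sketch of the kind of setup.py that would give that behaviour; this is an assumption about the repo's packaging, not a quote of the actual file:)

# Assumed packaging sketch: only the mdtk package is installed;
# ./baselines stays repo-only.
from setuptools import setup, find_packages

setup(
    name='mdtk',
    packages=find_packages(include=['mdtk', 'mdtk.*']),
)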

I don't plan to include an ipynb in the repo as it'll be quite large. However, I'll stick it up as a separate gist and link to it from the README, I think. Currently this ipynb calls the main function from within the script.

I've made a new issue (#90) to remind me to fix this for the ACME release (whilst this isn't a priority, it's definitely convenient for me to test this when I'm rerunning all the experiments on the new data!).

apmcleod commented 5 years ago

You've answered my question there. Basically I was wondering which directories are in the mdtk "package". This is probably fine then, although I'm still unsure about including the entire grid search. But it's probably fine to include it for posterity's sake. I can't test at the moment.

apmcleod commented 5 years ago

I can't really get my head around what this code is doing exactly.

Can someone use this like, e.g. get_results.py --model checkpoint.model --task 1 --format command ?

If not, can you give an example of how to use this (preferably from the command line) for someone who has a saved model of some kind for a given task? It looks like a lot of the code there is really for handling the format of the filenames we've chosen for the grid search, whereas it might be better to make this more generic and simpler, then call the script in a custom loop, like:

import glob
from baselines import get_results

for model in glob.glob('saved_models_task_1/*'):
    get_results.main(model=model, make_plots=True)  # hypothetical simpler API

or something like that. Does this call the eval scripts from mdtk/evaluation.py or does it read values from a dict somewhere?

JamesOwers commented 5 years ago

Yeah, I'm going to put in arg checking and docs to make it obvious. Hopefully today!


JamesOwers commented 5 years ago

Closing, as this will be included in the ACME release.