how to test on our own data

The best solution would be to add a test_reader to the Config class and to use it on this line: https://github.com/AmitMY/chimera/blob/master/process/pre_process.py#L9 (some tweaks are necessary, such as applying it only to the test set and not to train or dev).

The simple solution is to run the training code and then swap in a different test set.
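
To make the test_reader idea concrete, here is a minimal sketch of a config object that falls back to the main reader when no test reader is given. SketchConfig and reader_for are hypothetical names used only for illustration; they are not chimera's actual Config class, whose constructor may look different.

```python
class SketchConfig:
    """Hypothetical stand-in for a config class; not chimera's real Config."""

    def __init__(self, reader, test_reader=None, planner=None, reg=None):
        self.reader = reader
        # Fall back to the main reader when no separate test reader is given.
        self.test_reader = test_reader if test_reader is not None else reader
        self.planner = planner
        self.reg = reg

    def reader_for(self, split):
        """Return the reader to use for the 'train', 'dev' or 'test' split."""
        return self.test_reader if split == "test" else self.reader
```

With something like this, the pre-processing step could ask for reader_for("test") when building the test corpus and keep using the main reader for train and dev.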

You can run this:

config = Config(reader=WebNLGDataReader,
                planner=neural_planner,
                reg=BertREG)
res = MainPipeline.mutate({"config": config}).execute("WebNLG", cache_name="WebNLG")

Once the model finishes training on the training dataset, you can instantiate a new TestCorpus:

config = Config(reader=YourCustomDatasetReader)
test = TestCorpusPreProcessPipeline.mutate({"config": config}).execute("CustomName", cache_name="CustomName")

And finally, combine the two for translation:

# ** unpacks the training results; the custom test corpus then overrides "test-corpus"
translate = TranslatePipeline.mutate({**res, "test-corpus": test["test-corpus"]}).execute(...)
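
The merge above relies on plain Python dict unpacking, where a key written after ** overrides the unpacked one. A minimal sketch, assuming the pipeline results behave like ordinary dicts (the values below are placeholders, and swap_test_corpus is a hypothetical helper, not part of chimera):

```python
def swap_test_corpus(res, test):
    """Copy res, replacing its "test-corpus" entry with the one from test."""
    # Keys listed after **res take precedence over the unpacked ones.
    return {**res, "test-corpus": test["test-corpus"]}


# Placeholder values standing in for real pipeline outputs.
res = {"config": "webnlg-config", "train-model": "trained", "test-corpus": "webnlg-test"}
test = {"test-corpus": "custom-test"}

merged = swap_test_corpus(res, test)
```

Everything produced during training is carried over unchanged; only the test corpus is replaced, and res itself is not modified.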

The more advanced solution, which is also more extensible, is to create your own pipeline from the parts in the process directory.

Here is an example: https://github.com/AmitMY/chimera/blob/master/experiments.py

This file alone runs at least 16 experiments that I can remember, varying parameters such as planners, regs, and decoding methods. It can easily be modified to load whatever train or dev set you like and to try whatever configuration you want.
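
As a rough illustration of how such a set of experiments can be enumerated, the sketch below builds every combination of a few options. The option names are placeholders, not the ones used in experiments.py, and experiment_grid is a hypothetical helper.

```python
from itertools import product

# Placeholder option names; substitute the real planners, regs and decoders.
PLANNERS = ["planner_a", "planner_b"]
REGS = ["reg_a", "reg_b"]
DECODERS = ["decoder_a", "decoder_b"]


def experiment_grid(planners, regs, decoders):
    """Return one settings dict per combination of the given options."""
    return [
        {"planner": p, "reg": r, "decoder": d}
        for p, r, d in product(planners, regs, decoders)
    ]


grid = experiment_grid(PLANNERS, REGS, DECODERS)
```

Each dict in grid could then be turned into a config and run through its own pipeline.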
