facebookresearch / ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
https://parl.ai
MIT License

How to connect my custom Task or Agent without polluting source folders of ParlAI? #285

Closed hyzhak closed 6 years ago

hyzhak commented 7 years ago

I have already read the documentation at http://parl.ai/static/docs/task_tutorial.html and http://parl.ai/static/docs/seq2seq_tutorial.html, but I am still trying to find a convenient way to use ParlAI. I don't want to put my sources inside ParlAI; I need to keep them in my own git repository with the rest of my code. Ideally I'd like to pip install parlai and extend it with my custom classes, as many other frameworks allow, without mixing my project's sources with ParlAI's.

In short: how can I connect my custom Task or Agent without polluting the ParlAI source folders?

jaseweston commented 7 years ago

We don't have pip install yet, but it's on our list.

You can put your other GitHub repo inside the top level of ParlAI (it doesn't have to be in the core parlai/parlai folder, and it is otherwise separate) and then specify the full path with e.g. -t, like this: my_repo.agents:AgentName

That's what we are doing right now, unless you have better suggestions, Alex?
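
A minimal sketch of how a string in the `my_repo.agents:AgentName` form can be resolved to a class in plain Python. This is not ParlAI's actual loader; `load_class` is a hypothetical helper, and the example resolves a standard-library class only so that it runs anywhere.

```python
import importlib


def load_class(spec):
    """Resolve a 'package.module:ClassName' string to the class it names."""
    module_path, sep, class_name = spec.partition(":")
    if not sep or not module_path or not class_name:
        raise ValueError(f"expected 'module.path:ClassName', got {spec!r}")
    # import_module searches sys.path, so the package only needs to be
    # importable from the directory the script is launched in (or from
    # any other sys.path entry).
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Demonstrated with a standard-library class instead of a real agent.
ordered_dict_cls = load_class("collections:OrderedDict")
```

This is why the repo has to sit somewhere Python can import it from, such as the top level of the checkout the scripts are run from. Whether ParlAI's own scripts accept packages located elsewhere on `sys.path` is not confirmed in this thread.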

alexholdenmiller commented 6 years ago

Hi @hyzhak,

Our currently recommended installation does indeed involve polluting the ParlAI source folders (this makes it easier to use the scripts we include, which rely on shortcuts like the one Jason mentioned).

One thing you could do is write a very small wrapper in parlai/agents/my_agent/my_agent.py, and from there import your model from your own repo. Then the only "pollution" of the ParlAI source tree is that one folder. (For a task, do essentially the same thing in parlai/tasks/my_task/.)

As Jason mentioned, you could also move your whole repo into the top-level ParlAI folder and then point to your agents or tasks there by spelling out the full path.

Of course, if you write your own training loop (rather than using ours at examples/train_model.py), then you don't need to do any of this--you can manually import your agent or task classes in that loop instead of initializing them based on command-line arguments, and those classes can be fully contained in your own repository.
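
A rough sketch of the "own training loop" option, assuming the observe/act turn-taking described in the ParlAI tutorials. `EchoAgent`, `ListTeacher`, and `run_epoch` are hypothetical names, and the classes are defined inline only to keep the snippet self-contained. In a real project they would be imported from your own package and would build on ParlAI's base classes.

```python
class EchoAgent:
    """Hypothetical agent; in practice this lives in your own repository."""

    def __init__(self):
        self.last_observation = None

    def observe(self, observation):
        self.last_observation = observation
        return observation

    def act(self):
        text = self.last_observation["text"] if self.last_observation else ""
        return {"id": "EchoAgent", "text": text}


class ListTeacher:
    """Hypothetical task: serves fixed (text, label) pairs and scores replies."""

    def __init__(self, examples):
        self.examples = list(examples)
        self.index = 0
        self.correct = 0
        self.seen = 0

    def epoch_done(self):
        return self.index >= len(self.examples)

    def act(self):
        text, _ = self.examples[self.index]
        return {"id": "ListTeacher", "text": text}

    def observe(self, reply):
        _, label = self.examples[self.index]
        self.correct += int(reply["text"] == label)
        self.seen += 1
        self.index += 1


def run_epoch(teacher, agent):
    """Alternate teacher and agent turns until the teacher runs out of data."""
    while not teacher.epoch_done():
        agent.observe(teacher.act())
        teacher.observe(agent.act())
    return teacher.correct / max(teacher.seen, 1)


# The classes are passed in directly, so no command-line lookup is involved.
accuracy = run_epoch(ListTeacher([("hi", "hi"), ("bye", "ciao")]), EchoAgent())
```

Because the loop receives objects rather than names, nothing has to be placed inside the ParlAI source tree for the classes to be found.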

hyzhak commented 6 years ago

Thanks @alexanderkjeldaas and @jaseweston for answering! I really appreciate your help :)

> One thing you could do is just write a very small wrapper in parlai/agents/my_agent/my_agent.py, and then from there require your model from your own repo. Then the only "pollution" of the ParlAI source folder is this folder. (For a task, do essentially the same thing but in parlai/tasks/my_task/).

Sadly, that doesn't work for me because I don't want to touch the ParlAI sources at all. Even a small file such as parlai/agents/my_agent/my_agent.py would change the ParlAI sources :(

> As Jason mentioned, you could also move your whole repo to the parent-level folder and then specify agent or task paths into there by spelling out the full path there.

This is a possible workaround, but it also looks hairy and will complicate integration. It also definitely won't work with a pip installation, which is on your list. I hope that comes soon.

> Of course, if you write your own training loop (rather than using ours at examples/train_model.py), then you don't need to do any of this--you can manually import your agent or task classes in that loop instead of initializing them based on command-line arguments, and those classes can be fully contained in your own repository.

This looks like the right solution. I'm going to try it this way and give you feedback, if you don't mind.

alexholdenmiller commented 6 years ago

Feedback would be great--let us know how it works out!

One option is that we could modify the train_model.py loop (for example) to make it easier for you to call the main function from within your own script. That is to say, if you copied and pasted the entire file and found you only changed a line or two, we could possibly make that a parameter so that it's easier to just call that loop. This would also set us up better for a pip install in the future.

Thanks!
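
One way the "make that a parameter" idea could look, as a sketch only. This is not the actual train_model.py; `main`, `DefaultAgent`, `MyAgent`, `train_step`, and the `--num-steps` flag are all hypothetical names chosen for illustration.

```python
import argparse


class DefaultAgent:
    """Stand-in for an agent bundled with the framework (hypothetical)."""

    def __init__(self):
        self.steps_seen = []

    def train_step(self, step):
        self.steps_seen.append(step)


def main(argv=None, agent_class=DefaultAgent):
    """Run the loop; callers may inject their own agent class."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--num-steps", type=int, default=3)  # hypothetical flag
    opt = parser.parse_args(argv)  # argv=None falls back to sys.argv[1:]

    agent = agent_class()
    for step in range(opt.num_steps):
        agent.train_step(step)
    return agent


class MyAgent(DefaultAgent):
    """Hypothetical user-defined agent kept outside the framework."""


# From a script in your own repository: reuse the loop, swap in your class.
trained = main(["--num-steps", "2"], agent_class=MyAgent)
```

With an entry point shaped like this, a user script changes only the argument it passes instead of copying the whole file.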

alexholdenmiller commented 6 years ago

closing for now, feel free to reopen with any feedback

Henry-E commented 6 years ago

#605: This seems relevant to integrating OpenNMT-py.