rajcscw / nlp-gym

NLPGym - A toolkit to develop RL agents to solve NLP tasks.
MIT License

NLPGym

NLPGym is a toolkit that bridges the gap between reinforcement learning (RL) and natural language processing (NLP). It aims to facilitate research on, and benchmarking of, deep RL (DRL) applications for NLP tasks.

The toolkit provides interactive environments for standard NLP tasks such as sequence tagging, question answering, and sequence classification.

[Demo animations: Sequence Tagging | Question Answering | Multi-label Classification]

The environments provide standard RL interfaces and can therefore be used together with most RL frameworks such as baselines, stable-baselines, and RLlib.

Furthermore, the toolkit is designed in a modular fashion, giving users the flexibility to extend tasks with their own data sets, observations, and reward functions.
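
As a rough illustration of what a user-defined reward function could look like, the sketch below defines a standalone callable that scores an action against the target labels. The class name, its call signature, and the penalty value are assumptions made for this example; they are not NLPGym's actual reward interface, so check the paper or the source for the base class to extend.

```python
from typing import List


class ExactMatchReward:
    """Hypothetical reward: 1.0 for a correct label, a small penalty otherwise."""

    def __init__(self, penalty: float = -0.1):
        # Penalty returned for any action that is not a target label (assumed value).
        self.penalty = penalty

    def __call__(self, action: str, targets: List[str]) -> float:
        # Reward 1.0 when the chosen action is one of the target labels.
        return 1.0 if action in targets else self.penalty


reward_fn = ExactMatchReward()
print(reward_fn("B", ["B"]))  # correct answer
print(reward_fn("A", ["B"]))  # wrong answer
```

Keeping the reward as a small callable object makes it easy to swap in a different scoring rule without touching the environment logic.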


For more details on observations, reward functions, and featurizers, refer to our paper "NLPGym - A toolkit for evaluating RL agents on Natural Language Processing Tasks", which will be presented at Wordplay: When Language Meets Games @ NeurIPS 2020.


Cite

If you use this repository for your research, please cite it using the following BibTeX entry:

@misc{ramamurthy2020nlpgym,
      title={NLPGym -- A toolkit for evaluating RL agents on Natural Language Processing Tasks}, 
      author={Rajkumar Ramamurthy and Rafet Sifa and Christian Bauckhage},
      year={2020},
      eprint={2011.08272},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Install

Using pip:

pip install nlp-gym

To also install the dependencies needed for the demo scripts:

pip install "nlp-gym[demo]"

Alternatively, from source:

git clone https://github.com/rajcscw/nlp-gym.git
cd nlp-gym
pip install .

To also install the dependencies needed for the demo scripts:

pip install ".[demo]"

Usage

The environments follow the standard gym interface. The following script demonstrates a question answering environment with an agent that takes random actions.

from nlp_gym.data_pools.custom_question_answering_pools import QASC
from nlp_gym.envs.question_answering.env import QAEnv

# data pool
pool = QASC.prepare("train")

# question answering env
env = QAEnv()
for sample, weight in pool:
    env.add_sample(sample)

# play an episode
done = False
state = env.reset()
total_reward = 0
while not done:
    action = env.action_space.sample()
    state, reward, done, info = env.step(action)
    total_reward += reward
    env.render()
    print(f"Action: {env.action_space.ix_to

To train a DQN agent for the same task:

from nlp_gym.data_pools.custom_question_answering_pools import QASC
from nlp_gym.envs.question_answering.env import QAEnv
from nlp_gym.envs.question_answering.featurizer import InformedFeaturizer
from stable_baselines.deepq.policies import MlpPolicy as DQNPolicy
from stable_baselines import DQN
from stable_baselines.common.env_checker import check_env

# data pool
data_pool = QASC.prepare(split="train")
val_pool = QASC.prepare(split="val")

# featurizer
featurizer = InformedFeaturizer()

# question answering env
env = QAEnv(observation_featurizer=featurizer)
for sample, weight in data_pool:
    env.add_sample(sample, weight)

# check the environment
check_env(env, warn=True)

# train an MLP policy
model = DQN(env=env, policy=DQNPolicy, gamma=0.99, batch_size=32, learning_rate=1e-4,
            double_q=True, exploration_fraction=0.1,
            prioritized_replay=False, policy_kwargs={"layers": [64, 64]},
            verbose=1)
model.learn(total_timesteps=int(1e+4))

Further examples of training agents for other tasks can be found in the demo scripts.