bloomsburyai / question-generation

Neural text-to-text question generation
MIT License

Does this support parallel execution? #22

Closed pidugusundeep closed 5 years ago

pidugusundeep commented 5 years ago

Where can I pass multiple answers in parallel and get the questions back?

tomhosking commented 5 years ago

Have a look at this function in the demo.

pidugusundeep commented 5 years ago

@tomhosking what would be a sample request JSON for that?

tomhosking commented 5 years ago

I believe it's something like this:

{
  "queries": [
    [context, answer1, answer1 position],
    [context, answer2, answer2 position]
    ...
  ]
}
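
For reference, a minimal sketch of posting such a payload, assuming the demo server is running locally - the URL, port, route and the exact meaning of "answer position" below are assumptions, not taken from the repo:

import requests

# Hypothetical URL - check src/demo/app.py for the actual host, port and route.
DEMO_URL = "http://localhost:5000/api/generate_batch"

context = "The Eiffel Tower was completed in 1889 and stands in Paris."

payload = {
    "queries": [
        # [context, answer text, answer position] - position assumed to be the
        # character offset of the answer within the context.
        [context, "1889", context.find("1889")],
        [context, "Paris", context.find("Paris")],
    ]
}

response = requests.post(DEMO_URL, json=payload)
print(response.json())
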
pidugusundeep commented 5 years ago

Thanks, got it to work.

pidugusundeep commented 5 years ago

How much faster does it run when you send more than 32 answers? Do you have any timing stats on the runtime for process completion? I was able to run it, but I don't see any improvement in the time.

tomhosking commented 5 years ago

Inputs are batched into a max of 32 examples - if you have a GPU these will be processed in parallel, otherwise I don't really know how efficient TensorFlow is at parallelisation on CPU. If you submit more than 32 examples then there will be roughly n/32 calls to the model, one per batch.
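
As a rough illustration of that batching (not the repo's actual code), splitting queries into chunks of at most 32 looks like this:

def chunk(queries, batch_size=32):
    """Yield successive batches of at most `batch_size` queries."""
    for i in range(0, len(queries), batch_size):
        yield queries[i:i + batch_size]

# 600 queries -> 18 full batches of 32 plus a final batch of 24,
# i.e. 19 model calls in total.
queries = [["context", "answer", 0]] * 600
batches = list(chunk(queries))
print(len(batches), len(batches[-1]))  # 19 24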

pidugusundeep commented 5 years ago

@tomhosking Does increasing the batch size still work? I tried changing it to 50 instead of 32 and it throws:

get_q_batch
    ctxt_feats[0] = np.array(ctxt_feats[0], dtype=bytes)
IndexError: list index out of range

I was passing 600 answers and contexts as an input.

tomhosking commented 5 years ago

Is this for the demo? The batch size is hard coded at 32: https://github.com/bloomsburyai/question-generation/blob/master/src/demo/app.py#L64

Otherwise, yes changing the batch size should work fine.

pidugusundeep commented 5 years ago

I tried changing it to 50 as I have a lot of data, and it throws the list index out of range error (as shown in the earlier comment). Can you please check and let me know?

tomhosking commented 5 years ago

Is this happening in the demo or training script?

tomhosking commented 5 years ago

Also, as discussed, unless you're running this on a GPU then there's probably no advantage to using a larger batch size.

pidugusundeep commented 5 years ago

https://github.com/bloomsburyai/question-generation/blob/7148af1ba8cffe5ba56166e02bf5aef728f6fa83/src/demo/instance.py#L38

Right here, I suppose.

tomhosking commented 5 years ago

The batch size in the demo is hard coded to 32 - you'll need to change this either to the value you're using, or to FLAGS.batch_size

pidugusundeep commented 5 years ago

I changed here https://github.com/bloomsburyai/question-generation/blob/af189b1e43b5fca3f6a6f5d7e40362a4235c73fb/src/demo/app.py#L64-L66

Do I still need to change the flag as well? It's not being used anywhere, as far as I can see.

tomhosking commented 5 years ago

Yeah, I think you're right - the model works out the batch size from the inputs.

You'll also need to modify a few lines around L64 where 32 is also hard coded.
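
As a hedged sketch of that kind of change - the names below (queries, generate_questions) are stand-ins for whatever src/demo/app.py actually uses around L64, not the real identifiers:

BATCH_SIZE = 50  # or FLAGS.batch_size, so every hard-coded 32 is replaced consistently

def generate_questions(batch):
    # Stand-in for the real model call on one batch of [context, answer, position] queries.
    return ["question about %r" % answer for _, answer, _ in batch]

queries = [["some context", "answer %d" % i, 0] for i in range(120)]

results = []
for i in range(0, len(queries), BATCH_SIZE):
    batch = queries[i:i + BATCH_SIZE]
    results.extend(generate_questions(batch))

print(len(results))  # 120 questions back, however the batch size is set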

pidugusundeep commented 5 years ago

@tomhosking Yeah, did that and changed everything to 50 as per my requirement, but it still gives me an error. Is it hard coded in any other place?

tomhosking commented 5 years ago

Hmm, not that I can think of. Did you also try changing the --batch_size flag? What error gets thrown?

pidugusundeep commented 5 years ago

Ok, let me check and give you an update. Thanks for the quick response 😎