seominjoon / denspi

Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index (DenSPI)
https://nlp.cs.washington.edu/denspi
Apache License 2.0

Handle short sentences #2

Closed: seominjoon closed this issue 5 years ago

jhyuklee commented 5 years ago

Partially resolved using concatenation.

raman-r-4978 commented 4 years ago

Hi @jhyuklee, may I ask what you mean by concatenation? Is it text concatenation or vector concatenation?

Please refer to https://github.com/uwnlp/denspi/issues/13 for more details.

jhyuklee commented 4 years ago

Hi @RamanRajarathinam. We concatenated short sentences into a single paragraph (an input to BERT), then performed the indexing. This resolved the short sentence issues.
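For readers who want to try the same idea, here is a minimal sketch of text-level concatenation before indexing. It is not the code used in this repository: the function name `concat_short_sentences`, the `min_words` threshold, and the use of whitespace word counts as a rough stand-in for BERT token counts are all assumptions made for illustration.

```python
from typing import Iterable, List


def concat_short_sentences(sentences: Iterable[str], min_words: int = 50) -> List[str]:
    """Greedily merge consecutive sentences into paragraphs of at least `min_words` words.

    Hypothetical helper: `min_words` is an assumed threshold, and whitespace
    splitting only approximates the number of BERT word pieces.
    """
    paragraphs: List[str] = []
    buffer: List[str] = []
    count = 0
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence:
            continue  # skip empty lines
        buffer.append(sentence)
        count += len(sentence.split())
        if count >= min_words:
            # The buffer is long enough: emit it as one paragraph.
            paragraphs.append(" ".join(buffer))
            buffer, count = [], 0
    if buffer:
        leftover = " ".join(buffer)
        if paragraphs:
            # Attach a short tail to the previous paragraph instead of
            # leaving a too-short paragraph at the end.
            paragraphs[-1] = paragraphs[-1] + " " + leftover
        else:
            paragraphs.append(leftover)
    return paragraphs
```

Each returned string would then be passed to the indexing step as one paragraph. Merging only consecutive sentences keeps neighbouring context together, which seems closer to a natural passage than mixing unrelated texts.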

raman-r-4978 commented 4 years ago

Oh, okay. But may I ask why the model does not perform as expected when the input text is short?

jhyuklee commented 4 years ago

That's because the model was trained on SQuAD, whose passages are usually longer than a single sentence.

raman-r-4978 commented 4 years ago

So the only solution is to concatenate different texts? Or is there any other way to solve this?

raman-r-4978 commented 4 years ago

Could you also please comment on these issues: https://github.com/uwnlp/denspi/issues/9 and https://github.com/uwnlp/denspi/issues/13?

jhyuklee commented 4 years ago

So the only solution is to concatenate different texts? Or is there any other way to solve this? => For now, yes, but I guess there could be plenty of other solutions that reduce the length bias, such as augmenting the training data with short-passage QA pairs.
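One possible reading of the augmentation idea, sketched under assumptions: take a SQuAD-style example and crop its context to the single sentence that contains the answer, producing an extra short-passage training pair. The function name `crop_to_answer_sentence`, the returned dictionary keys, and the naive regex sentence splitter are hypothetical and not part of DenSPI. Whether this actually removes the length bias is untested here.

```python
import re
from typing import Dict, Optional, Union


def crop_to_answer_sentence(
    context: str, answer_text: str, answer_start: int
) -> Optional[Dict[str, Union[str, int]]]:
    """Return a short-passage copy of a QA example, or None if it cannot be cropped.

    Hypothetical helper: sentence boundaries are guessed as whitespace that
    follows '.', '!' or '?', which is only a rough approximation.
    """
    answer_end = answer_start + len(answer_text)
    if not answer_text or context[answer_start:answer_end] != answer_text:
        return None  # offsets do not match the text
    start = 0
    for match in re.finditer(r"(?<=[.!?])\s+", context):
        sent_end = match.start()
        if start <= answer_start and answer_end <= sent_end:
            # The answer lies fully inside this sentence.
            return {
                "context": context[start:sent_end],
                "answer_text": answer_text,
                "answer_start": answer_start - start,
            }
        if sent_end > answer_start:
            return None  # the answer crosses a sentence boundary
        start = match.end()
    if start <= answer_start:
        # The answer is in the last sentence.
        return {
            "context": context[start:],
            "answer_text": answer_text,
            "answer_start": answer_start - start,
        }
    return None
```

The cropped examples would be added to the original training set rather than replacing it, so the model still sees full-length passages.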