Bias In Retriever towards Long Texts

I tried the retriever module to fetch documents related to a question, but almost every time the long documents were ranked as the best matches.

I tried to find out whether the tf vector is normalized when the compressed sparse matrix is created, but I couldn't tell.
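
To make the concern concrete, here is a minimal sketch of the effect. It is a toy example and not DrQA's code: the helper names (`tf_vector`, `l2_normalize`, `is_unit_length`, `score`), the log-scaled tf weighting, and the two documents are all assumptions made for illustration, and idf is left out to keep it short. It only shows that, when document vectors are not length-normalized, a long document that repeats the query terms can outscore a short, focused one.

```python
import math
from collections import Counter


def tf_vector(tokens):
    """Log-scaled term frequencies, log(1 + count), with no length normalization."""
    return {term: math.log1p(count) for term, count in Counter(tokens).items()}


def l2_normalize(vec):
    """Scale a sparse vector (dict) to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def is_unit_length(vec, tol=1e-9):
    """Check whether a sparse vector already has Euclidean length 1."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return math.isclose(norm, 1.0, abs_tol=tol)


def score(query_tokens, doc_vec):
    """Dot product of a binary query vector with the document vector."""
    return sum(doc_vec.get(t, 0.0) for t in set(query_tokens))


# Hypothetical toy corpus: one short focused document, one long document
# that repeats two query terms and contains a lot of unrelated text.
short_doc = "paris is the capital of france".split()
long_doc = (
    "france " * 5
    + "capital " * 3
    + "history economy culture wine cheese alps rivers trains " * 10
).split()
query = "capital of france".split()

raw_short = score(query, tf_vector(short_doc))
raw_long = score(query, tf_vector(long_doc))
norm_short = score(query, l2_normalize(tf_vector(short_doc)))
norm_long = score(query, l2_normalize(tf_vector(long_doc)))

print("raw scores:        short=%.3f long=%.3f" % (raw_short, raw_long))
print("normalized scores: short=%.3f long=%.3f" % (norm_short, norm_long))
```

In this toy setup the long document wins on raw scores and the short one wins after L2 normalization. A check like `is_unit_length` applied to one document's column of the matrix would be one way to tell whether normalization is happening.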

Can someone tell me whether I am right or wrong? Has anyone else noticed this?

facebookresearch / DrQA #254