Questions like "What is 802.11" or "What is P=NP?" are parsed poorly because they contain expressions such as 802.11 or P=NP.
Possible solution for such cases: apply the same algorithm we already use for quotations:
1. Identify "strange" expressions. An expression is a sequence of characters without spaces; it is "strange" if it contains an unusual symbol (., =, ...). Examples: 802.11, P=NP, ... Be careful with trailing punctuation: take P=NP, not P=NP?
2. Replace each of them with a random string.
3. Parse the result with the Stanford Parser.
4. Replace each random string with the original expression.
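
The steps above could be sketched roughly as follows. This is only an illustration, not existing project code: the names `STRANGE`, `mask_strange` and `unmask` are hypothetical, the set of "strange" symbols is assumed to be just `.` and `=`, and the call to the Stanford Parser is left out.

```python
import random
import re
import string

# Assumption: only '.' and '=' count as "strange" symbols; extend the class as needed.
# The symbol must sit between alphanumeric runs, so a trailing '?' or '.' is not captured.
STRANGE = re.compile(r"[A-Za-z0-9]+(?:[.=][A-Za-z0-9]+)+")


def mask_strange(question, rng=random):
    """Replace each strange expression with a random lowercase word.

    Returns the masked question and a dict mapping placeholder -> original.
    """
    mapping = {}

    def substitute(match):
        # Draw placeholders until one does not collide with the question
        # or with a placeholder that is already in use.
        while True:
            placeholder = "".join(rng.choices(string.ascii_lowercase, k=10))
            if placeholder not in question and placeholder not in mapping:
                break
        mapping[placeholder] = match.group(0)
        return placeholder

    return STRANGE.sub(substitute, question), mapping


def unmask(text, mapping):
    """Put the original expressions back in place of the placeholders."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


masked, mapping = mask_strange("What is P=NP?")
# ... the masked question would be sent to the parser here ...
restored = unmask(masked, mapping)
```

Here `unmask` works on a plain string; on real parser output, the same mapping would have to be applied to the words of the parse tree instead.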
If someone wants to implement this, please do it in the reverse_predicates branch, in the file preprocessingMerge.py.