dice-group / Palmetto

Palmetto is a quality measuring tool for topics
GNU Affero General Public License v3.0

POST request with items containing _ (underscore) returns unexpected results #10

Closed earthquakesan closed 7 years ago

earthquakesan commented 7 years ago

A POST request for document frequencies against the public endpoint at http://palmetto.aksw.org/palmetto-webapp/service/df returns an unexpected number of results if one of the items (words) in the query contains an underscore (_) character. For example, this query works as expected:

http://palmetto.aksw.org/palmetto-webapp/service/df?words=cat

However, this query fails:

http://palmetto.aksw.org/palmetto-webapp/service/df?words=cat_

The same happens for compound terms such as:

http://palmetto.aksw.org/palmetto-webapp/service/df?words=foundation_year

The expected result for such a term would be an empty set. That is, if a query term has no data, the first 4 bytes should be empty. Right now, if a query contains a term like this somewhere in the middle, e.g.:

http://palmetto.aksw.org/palmetto-webapp/service/df?words=cat%20foundation_year%20dog

It returns two results, one for cat and one for dog, and the middle term is ignored completely. As it stands, such cases require special parsing on the client side, which should not be necessary.
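To illustrate why a silently dropped term is a problem for clients, here is a minimal sketch of positional parsing. It assumes a response layout that this issue only hints at: for each term, a 4-byte big-endian count followed by that many 4-byte document ids. The function names `parse_df_response` and `align_terms` are hypothetical, and the payloads are built by hand rather than fetched from the service.

```python
import struct


def parse_df_response(payload: bytes) -> list:
    """Split a payload into one list of document ids per term.

    Assumed layout (not confirmed by the service documentation):
    a 4-byte big-endian count, then `count` 4-byte big-endian ids, repeated.
    """
    blocks = []
    offset = 0
    while offset < len(payload):
        (count,) = struct.unpack_from(">i", payload, offset)
        offset += 4
        ids = list(struct.unpack_from(f">{count}i", payload, offset)) if count else []
        offset += 4 * count
        blocks.append(ids)
    return blocks


def align_terms(words: list, payload: bytes) -> dict:
    """Map each queried word to its block, refusing to guess on a mismatch."""
    blocks = parse_df_response(payload)
    if len(blocks) != len(words):
        # A term was dropped somewhere; positions can no longer be trusted.
        raise ValueError(
            f"got {len(blocks)} result blocks for {len(words)} words"
        )
    return dict(zip(words, blocks))
```

With an empty block for the middle term, `align_terms` can pair words and results by position. With the block missing entirely, the client cannot tell which term was dropped without knowing about the underscore behaviour in advance, so the sketch raises instead of guessing.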

MichaelRoeder commented 7 years ago

The problem seems to be caused by the Spring framework. A local JUnit test works fine, while a call to the online service does not work as expected. Thus, Spring's default handling of method parameters seems to remove words containing an underscore.