joaopalotti / trectools

A simple toolkit to process TREC files in Python.
https://pypi.python.org/pypi/trectools
BSD 3-Clause "New" or "Revised" License

Close #37 #38

Closed lgienapp closed 2 years ago

lgienapp commented 2 years ago

This PR changes how run and qrel files are loaded from disk or memory (#37). The goals are (1) to enforce column types and validate column names when the default arguments are used, and (2) to allow both automatic checks to be bypassed with custom arguments when needed.

Flexible column naming

Tl;dr: all read methods now have a header argument for specifying custom column names. The load methods have a mapping argument that optionally maps external columns to the internal representation.
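As a rough illustration of the idea (not the code in this PR), the sketch below uses plain pandas. The helper names `read_run` and `load_run`, their signatures, and the default column names are assumptions made for the example and may differ from what trectools actually exposes.

```python
import io

import pandas as pd

# Hypothetical internal column names, chosen for this sketch only.
DEFAULT_RUN_HEADER = ["query", "q0", "docid", "rank", "score", "system"]


def read_run(source, header=None):
    """Read a whitespace-separated run file, optionally with custom column names."""
    names = list(header) if header is not None else DEFAULT_RUN_HEADER
    return pd.read_csv(source, sep=r"\s+", names=names, header=None)


def load_run(frame, mapping=None):
    """Return a copy of an in-memory frame with external columns renamed."""
    return frame.rename(columns=mapping or {})


text = "301 Q0 doc1 1 12.5 bm25\n301 Q0 doc2 2 11.0 bm25\n"

# Default header: columns receive the internal names.
default_run = read_run(io.StringIO(text))

# Custom header: the caller decides what the columns are called.
custom_run = read_run(
    io.StringIO(text),
    header=["topic", "iter", "doc", "pos", "sim", "tag"],
)

# Mapping: external column names are translated to the internal ones.
external = pd.DataFrame({"topic": ["301"], "doc": ["doc1"], "sim": [12.5]})
mapped_run = load_run(
    external, mapping={"topic": "query", "doc": "docid", "sim": "score"}
)
```

In this sketch, `header` only affects how a file is parsed, while `mapping` renames the columns of data that is already in memory.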

Forced Typing & Validation

Tl;dr: Columns with default names have forced types. This can be circumvented by passing the header/mapping argument during data loading, because columns with custom names are not type-checked.
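A minimal sketch of this behavior, again in plain pandas: the `DEFAULT_TYPES` table and the `enforce_default_types` helper are hypothetical and only illustrate the rule that default-named columns are cast while custom-named columns are left untouched. The actual types and names used by the PR may differ.

```python
import pandas as pd

# Hypothetical default names and their forced types, for illustration only.
DEFAULT_TYPES = {"query": str, "docid": str, "rank": int, "score": float}


def enforce_default_types(frame):
    """Cast only the columns whose names match a known default name."""
    known = {
        name: dtype
        for name, dtype in DEFAULT_TYPES.items()
        if name in frame.columns
    }
    return frame.astype(known)


raw = pd.DataFrame(
    {
        "query": [301, 302],      # default name: cast to str
        "docid": ["d1", "d2"],    # default name: already str
        "rank": ["1", "2"],       # default name: cast to int
        "sim": ["12.5", "11.0"],  # custom name: not type-checked
    }
)
typed = enforce_default_types(raw)
```

Here the `sim` column keeps its original string values because its name is not one of the assumed defaults, which mirrors how custom names bypass the check.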

Unittests

Tl;dr: the unit tests were adapted to the changes above.

joaopalotti commented 2 years ago

Looks awesome, @lgienapp! Thanks for your PR!