namsor / namsor-tools-v2

NamSor command line tools, to append gender, origin, diaspora or us 'race'/ethnicity to a CSV file.
GNU Lesser General Public License v3.0

Slightly negative score causes Python lib error #17

Closed uncopied closed 3 years ago

uncopied commented 3 years ago

The error that halts the output appears to come from the following score setter in BatchFirstLastNameUSRaceEthnicityOut (in the openapi client module's models):

    def score(self, score):
        """Sets the score of this FirstLastNameUSRaceEthnicityOut.

        Higher score is better, but score is not normalized. Use calibratedProbability if available.   # noqa: E501

        :param score: The score of this FirstLastNameUSRaceEthnicityOut.  # noqa: E501
        :type: float
        """
        if score is not None and score > 100:  # noqa: E501
            raise ValueError("Invalid value for `score`, must be a value less than or equal to `100`")  # noqa: E501
        if score is not None and score < 0:  # noqa: E501
            raise ValueError("Invalid value for `score`, must be a value greater than or equal to `0`")  # noqa: E501
        self._score = score
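To illustrate why a slightly negative score stops processing, the range check can be reproduced in isolation. This is a minimal sketch: `ScoreHolder` and `try_set` are hypothetical names, not part of the NamSor client, and only the two checks are copied from the setter above.

```python
class ScoreHolder:
    """Hypothetical stand-in for the generated model; only the score check is copied."""

    def __init__(self):
        self._score = None

    @property
    def score(self):
        return self._score

    @score.setter
    def score(self, score):
        # Same range checks as the generated setter quoted above.
        if score is not None and score > 100:
            raise ValueError("Invalid value for `score`, must be a value less than or equal to `100`")
        if score is not None and score < 0:
            raise ValueError("Invalid value for `score`, must be a value greater than or equal to `0`")
        self._score = score


def try_set(value):
    """Return None on success, or the error message if the check rejects the value."""
    holder = ScoreHolder()
    try:
        holder.score = value
    except ValueError as err:
        return str(err)
    return None


if __name__ == "__main__":
    for candidate in (50.0, -0.01, None):
        print(candidate, "->", try_set(candidate))
```

Any value below 0, however small, raises `ValueError`, so a single out-of-range score is enough to abort the deserialization of a whole batch.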

I tried again on progressively smaller batches, then entered a couple of names on the NamSor site as a sanity check. It turns out that some names return a negative score (for example, see attached scores). Note that both appear to be Hispanic predictions, which I saw caused a bug in earlier iterations (per some commit notes on your GitHub).

uncopied commented 3 years ago

Thank you. We have identified the root cause: ~3% of scores fall outside the 0-100 range, with a value slightly below 0, which triggers an error in the API sanity check. This has been fixed in development and will ship in our next release on 25th/26th September.

Meanwhile, as a quick fix, we propose commenting out the lines that raise the error:

    if score is not None and score > 100:  # noqa: E501
        raise ValueError("Invalid value for `score`, must be a value less than or equal to `100`")  # noqa: E501
    if score is not None and score < 0:  # noqa: E501
        raise ValueError("Invalid value for `score`, must be a value greater than or equal to `0`")  # noqa: E501
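If editing the installed package is inconvenient, a similar effect might be achievable at runtime by replacing the property setter before any responses are parsed. The sketch below only demonstrates the idea on a stand-in class: `GeneratedModel` and `_set_score_unchecked` are hypothetical names, and whether the real generated client can be patched this way is an assumption that has not been verified here.

```python
class GeneratedModel:
    """Hypothetical stand-in for the generated model class."""

    def __init__(self):
        self._score = None

    @property
    def score(self):
        return self._score

    @score.setter
    def score(self, score):
        # Lower-bound check as in the generated setter.
        if score is not None and score < 0:
            raise ValueError("Invalid value for `score`, must be a value greater than or equal to `0`")
        self._score = score


def _set_score_unchecked(self, score):
    # Equivalent to the original setter with the range checks commented out.
    self._score = score


# Swap in a property that keeps the original getter but skips validation.
GeneratedModel.score = property(GeneratedModel.score.fget, _set_score_unchecked)

model = GeneratedModel()
model.score = -0.25  # no longer raises
print(model.score)
```

Like commenting out the lines, this is a temporary measure and can be dropped once the fixed release is installed.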

Sorry about this, and thanks for reporting the error.

uncopied commented 3 years ago

Fixed in v2.0.16