Nekmo / pip-rating

Check the health of your project's requirements and get a score for each dependency.
https://docs.nekmo.org/pip-rating/
MIT License
29 stars 1 forks source link

[pip-rating The Lord of the Words] Error ISO dateformat with Python 3.10 and previous versions #46

Closed SoyGema closed 11 months ago

SoyGema commented 11 months ago

Command that causes the issue

Context

Congrats on the project and the great lighting talk at PyConES2023 ! Really great project. Thanks for contributing to the community. Keep the great work up! πŸ‘ IΒ΄m testing out under the context of The Lord of The Words project

So I go to the project folder afterpip install pip-rating and run

$ pip-rating 

Expected behavior

Hypothesis

click isodate format working

Actual behavior

It does indeed start analyzing dependencies, but at some point it shows the following error. Posting here full-trace

(.torchenv) pip-rating
Autodetected requirements file: requirements.txt
Traceback (most recent call last):
  File "/Users/gema/miniconda3/bin/pip-rating", line 8, in <module>
    sys.exit(manage())
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/pip_rating/management.py", line 206, in manage
    catch(cli)()
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/pip_rating/exceptions.py", line 45, in wrap
    fn(*args, **kwargs)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/core.py", line 1666, in invoke
    rv = super().invoke(ctx)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/pip_rating/management.py", line 77, in cli
    ctx.invoke(analyze_file, file=str(req_file.path))
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/pip_rating/management.py", line 160, in 
analyze_file
    results.show_results(dependencies, format_name)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/pip_rating/results.py", line 294, in 
show_results
    self.show_packages_results(dependencies)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/pip_rating/results.py", line 307, in 
show_packages_results
    global_rating_score = self.get_global_rating_score(dependencies)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/pip_rating/results.py", line 277, in 
get_global_rating_score
    global_rating_score = dependencies.get_global_rating_score()
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/pip_rating/dependencies.py", line 139, in 
get_global_rating_score
    global_rating_score = package.rating.get_global_rating_score()
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/pip_rating/rating.py", line 322, in 
get_global_rating_score
    [self.get_rating_score(from_package)]
  File "/Users/../miniconda3/lib/python3.10/site-packages/pip_rating/rating.py", line 318, in 
get_rating_score
    return self.rating_score
  File "/Users/../miniconda3/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/Users/../miniconda3/lib/python3.10/site-packages/pip_rating/rating.py", line 291, in 
rating_score
    scores = dict(self.breakdown_scores).values()
  File "/Users/gema/miniconda3/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/Users/../miniconda3/lib/python3.10/site-packages/pip_rating/rating.py", line 277, in 
breakdown_scores
    return [
  File "/Users/.../miniconda3/lib/python3.10/site-packages/pip_rating/rating.py", line 278, in <listcomp>
    (breakdown.breakdown_key, breakdown.get_score(self))
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/pip_rating/rating.py", line 146, in get_score
    dt = datetime.datetime.fromisoformat(iso_dt)
ValueError: Invalid isoformat string: '2023-09-14T18:56:29.702900Z'

Interpretation

IΒ΄m still getting familiarized with some details, but what IΒ΄m understanding ( please, donΒ΄t hesitate to tell me if IΒ΄m wrong or pointing me out to the right direction ) is that this might be related with click date conversion somehow? The mental model under the click choice hypothesis is based on part of the following trace, but I might be wrong!

  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/core.py", line 1666, in invoke
    rv = super().invoke(ctx)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/gema/miniconda3/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)

When I go here , what Im understanding is that you are trying to use a delta-time to determine somehow a "time-trusted-score" based on when was it updated and now?

Traceback

No response

Pip-rating version

Info

pip-rating 0.2.1 πŸ” This is the latest version. 🐍 Python version: 3.10.10 πŸ’» Platform: macOS-13.4.1-arm64-arm-64bit

Other details

I really found useful the tool verbosity, as I could see that It was going down in the analysis. Congrats for bringing transparency to OSS maintenance!

Checklist

laureanorp commented 11 months ago

I just experienced the same error, while analysing the package itsdangerous, I think.

ValueError: Invalid isoformat string: '2022-03-24T15:12:15.102192Z'

macOS 13.6, python 3.10, pip-rating 0.2.1

SoyGema commented 11 months ago

Hey @laureanorp ! Thanks for taking part of the conversation. This is great, and might entail that some packages are dealing with the same issue...

IΒ΄m trying to backtrack this and this is what IΒ΄ve got so far.

πŸ™Please note my limited knowledge about the codebase, and please see this mental model more like a an intention of being part of brainstorming a possible solution without any kind of forcing intention to implement any of those by the maintainer πŸ™ NOTES : Unclear if the package is using ISO 8601

The ISO 8601 datetime format includes milliseconds, but the fromisoformat() method in Python's datetimemodule expects the string to have exactly six digits for milliseconds. The hypothesis might be that in some libraries provided string, it appears that there are more than six digits for milliseconds (e.g., 702900).

HereΒ΄s what I think that can be done

  1. Truncate miliseconds portion to six digits ( quick fix, not very elegant ) inside get_score function . I put this however because it doesnΒ΄t entail the use of any extra external package
    iso_dt = iso_dt[:26]  # Keep the first 26 characters (up to microseconds)
  2. A more elegant solution could entail using a regular expression to extract the relevant protion and then parsing it into a datetime object ... but unclear if this might give more errors...
    
    import re

def parse_iso_datetime_with_ms(iso_dt):

Define a regex pattern to extract the datetime part with milliseconds

iso_datetime_pattern = re.compile(r'(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{6})')

match = iso_datetime_pattern.search(iso_dt)

if match:
    iso_datetime = match.group(1)
    dt = datetime.datetime.fromisoformat(iso_datetime)
    return dt

return None

def get_score(self, package_rating: "PackageRating") -> ScoreValue: iso_dt = self.get_breakdown_value(package_rating) if not iso_dt: return ScoreValue(0)

dt = parse_iso_datetime_with_ms(iso_dt)

if dt:
    dt_delta = datetime.datetime.now(datetime.timezone.utc) - dt
    for delta, score in self.scores.items():
        if dt_delta < delta:
            return ScoreValue(score)

return ScoreValue(self.default)
3. Not sure about this one, but maybe using `python-dateutil` library . But unclear if this aligns with the creator / maintainer philosophy for the package ... 

from dateutil import parser ...

Inside the get_score function

try:
    dt = parser.isoparse(iso_dt)


Happy to see more people around testing out this great project! 
Nekmo commented 11 months ago

Hi @SoyGema & @laureanorp , thank you very much for your comments and your interest in pip-rating!

I am sorry I couldn't respond sooner, I didn't have my development environment on the laptop I brought to the conference.

I have tried using Python 3.11.5 and I did not find the bug, but I did find it using 3.10.12.

Python 3.10:

$ pip-rating analyze-package --cache-dir /tmp/foo itsdangerous
Traceback (most recent call last):
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/bin/pip-rating", line 8, in <module>
    sys.exit(manage())
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/management.py", line 206, in manage
    catch(cli)()
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/exceptions.py", line 45, in wrap
    fn(*args, **kwargs)
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/management.py", line 201, in analyze_package
    results.show_results(dependencies, format_name)
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/results.py", line 294, in show_results
    self.show_packages_results(dependencies)
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/results.py", line 307, in show_packages_results
    global_rating_score = self.get_global_rating_score(dependencies)
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/results.py", line 277, in get_global_rating_score
    global_rating_score = dependencies.get_global_rating_score()
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/dependencies.py", line 139, in get_global_rating_score
    global_rating_score = package.rating.get_global_rating_score()
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/rating.py", line 322, in get_global_rating_score
    [self.get_rating_score(from_package)]
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/rating.py", line 318, in get_rating_score
    return self.rating_score
  File "/usr/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/rating.py", line 291, in rating_score
    scores = dict(self.breakdown_scores).values()
  File "/usr/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/rating.py", line 277, in breakdown_scores
    return [
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/rating.py", line 278, in <listcomp>
    (breakdown.breakdown_key, breakdown.get_score(self))
  File "/home/nekmo/.virtualenvs/tmp-574d64e506cfa4f/lib/python3.10/site-packages/pip_rating/rating.py", line 146, in get_score
    dt = datetime.datetime.fromisoformat(iso_dt)
ValueError: Invalid isoformat string: '2022-03-24T15:12:15.102192Z'

Python 3.11:

$ pip-rating analyze-package itsdangerous
Analyzed all packages ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
πŸ“¦ Package itsdangerous: A
  β—Ύ Basic info present: +1
  β—Ύ Source repository present: +1
  β—Ύ Readme present: +1
  β—Ύ License present: +1
  β—Ύ Has multiple versions: +3
  β—Ύ Dependent projects: +6
  β—Ύ Dependent repositories: +5
  β—Ύ Stars: +3
  β—Ύ Contributors: +1
  β—Ύ Latest upload : 0
  β—Ύ First upload : +4
  β—Ύ Package in readme: +1

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Global rating score: A β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

I have consulted the documentation and it seems that the fromisoformat method changed in Python 3.11. It was my fault for not checking the changes made to that method. pip-rating should support all versions of Python from 3.8 to 3.12, but I have not correctly tested that method. https://docs.python.org/3.11/library/datetime.html?highlight=fromisoformat#datetime.datetime.fromisoformat

My suggestion is a condition that checks the Python version. If the version is 3.11 or higher, fromisoformat should be used. If it is 3.10 or lower, the strptime method (with several possible formats) or dateutil should be used. The behavior of the latter condition should be the same as fromisoformat. In the future, when the minimum version is Python 3.11, the condition for Python 3.10 version can be removed.

Thank you both so much for your help with this bug. I should have tested better with different versions of Python! Next time, I won't trust unittests as much.

Since you've helped so much, if you want you can contribute with a PR. Otherwise, I'll fix the bug tomorrow and release a new patch version as soon as possible.

A big hug to both of you. See you at the next PyConES!

SoyGema commented 11 months ago

Hey @Nekmo ! Thanks so much for your time, and for your openness towards accepting contribs πŸ‘

My suggestion is a condition that checks the Python version. If the version is 3.11 or higher, fromisoformat should be used. If it is 3.10 or lower, the strptime method (with several possible formats) or dateutil should be used. The behavior of the latter condition should be the same as fromisoformat. In the future, when the minimum version is Python 3.11, the condition for Python 3.10 version can be removed.

Can you please take a look at the function above and validate if I understood the proposal correctly to submit a PR? Happy to give kudos or submit a join contrib and/or also receive some feedback from @laureanorp (If he doesnΒ΄t mind!)

IΒ΄m aware that the best practice is to directly sub PR but It would be awesome to align from a starting point with this in the sense of making two functions: one that can encapsulate conversions-python-dependent parse_iso_datetime and the other just changing the parsing line in get_score + validate logic ... Also to include/give space somehow @laureanorp without pushing him directly into reviewing/or leaving him out of this πŸ™‚

@Nekmo EDITED for clarification : I propose 2 functions for functionality and for keeping the get_score function clean. It is indeed a very semantic and clean function already and encapsulate date_time functionality make sense for me at first thought, but I can be wrong! Let me know why you shall prefer it in one function, I also checked if there was some kind of utils.py script to see if it could be added there

import sys
from datetime import datetime
from dateutil import parser as dateutil_parser

def parse_iso_datetime(iso_dt):
    if sys.version_info >= (3, 11):
        # Python 3.11 or higher, use fromisoformat
        return datetime.fromisoformat(iso_dt)
    else:
        try:
            # Try using fromisoformat (Python 3.7+)
            return datetime.fromisoformat(iso_dt)
        except ValueError:
            # Fallback to dateutil.parser for older Python versions
            return dateutil_parser.isoparse(iso_dt)

def get_score(self, package_rating: "PackageRating") -> ScoreValue:
    iso_dt = self.get_breakdown_value(package_rating)
    if not iso_dt:
        return ScoreValue(0)

    try:
        dt = parse_iso_datetime(iso_dt)
    except ValueError:
        return ScoreValue(0)

    dt_delta = datetime.now(datetime.timezone.utc) - dt
    for delta, score in self.scores.items():
        if dt_delta < delta:
            return ScoreValue(score)

    return ScoreValue(self.default)

Looking forward to hear from you!

πš†πš‘πš’ πšπš’πš πšπš‘πšŽ π™Ώπš’πšπš‘πš˜πš— πš™πš›πš˜πšπš›πšŠπš–πš–πšŽπš› πšŠπš™πš™πš•πš’ πšπš˜πš› 𝚊 πš“πš˜πš‹ 𝚊𝚜 𝚊 πšπš’πš–πšŽ πšπš›πšŠπšŸπšŽπš•πšŽπš›? π™±πšŽπšŒπšŠπšžπšœπšŽ πšπš‘πšŽπš’ πš‘πšŽπšŠπš›πš πšπš‘πšŠπš πš’πš— πšπš‘πšŽ πš™πšŠπšœπš, πšπš‘πšŽπš’ πšπš’πšπš—'𝚝 πš‘πšŠπšŸπšŽ 𝚝𝚘 πšπšŽπšŠπš• πš πš’πšπš‘ π™Έπš‚π™Ύ πšπšŠπšπšŽπšπš’πš–πšŽ πšœπšπš›πš’πš—πšπšœ!

Nekmo commented 11 months ago

Hey @Nekmo ! Thanks so much for your time, and for your openness towards accepting contribs πŸ‘

My suggestion is a condition that checks the Python version. If the version is 3.11 or higher, fromisoformat should be used. If it is 3.10 or lower, the strptime method (with several possible formats) or dateutil should be used. The behavior of the latter condition should be the same as fromisoformat. In the future, when the minimum version is Python 3.11, the condition for Python 3.10 version can be removed.

Can you please take a look at the function above and validate if I understood the proposal correctly to submit a PR? Happy to give kudos or submit a join contrib and/or also receive some feedback from @laureanorp (If he doesnΒ΄t mind!)

IΒ΄m aware that the best practice is to directly sub PR but It would be awesome to align from a starting point with this in the sense of making two functions: one that can encapsulate conversions-python-dependent parse_iso_datetime and the other just changing the parsing line in get_score + validate logic ... Also to include/give space somehow @laureanorp without pushing him directly into reviewing/or leaving him out of this πŸ™‚

@Nekmo EDITED for clarification : I propose 2 functions for functionality and for keeping the get_score function clean. It is indeed a very semantic and clean function already and encapsulate date_time functionality make sense for me at first thought, but I can be wrong! Let me know why you shall prefer it in one function, I also checked if there was some kind of utils.py script to see if it could be added there

import sys
from datetime import datetime
from dateutil import parser as dateutil_parser

def parse_iso_datetime(iso_dt):
    if sys.version_info >= (3, 11):
        # Python 3.11 or higher, use fromisoformat
        return datetime.fromisoformat(iso_dt)
    else:
        try:
            # Try using fromisoformat (Python 3.7+)
            return datetime.fromisoformat(iso_dt)
        except ValueError:
            # Fallback to dateutil.parser for older Python versions
            return dateutil_parser.isoparse(iso_dt)

def get_score(self, package_rating: "PackageRating") -> ScoreValue:
    iso_dt = self.get_breakdown_value(package_rating)
    if not iso_dt:
        return ScoreValue(0)

    try:
        dt = parse_iso_datetime(iso_dt)
    except ValueError:
        return ScoreValue(0)

    dt_delta = datetime.now(datetime.timezone.utc) - dt
    for delta, score in self.scores.items():
        if dt_delta < delta:
            return ScoreValue(score)

    return ScoreValue(self.default)

Looking forward to hear from you!

πš†πš‘πš’ πšπš’πš πšπš‘πšŽ π™Ώπš’πšπš‘πš˜πš— πš™πš›πš˜πšπš›πšŠπš–πš–πšŽπš› πšŠπš™πš™πš•πš’ πšπš˜πš› 𝚊 πš“πš˜πš‹ 𝚊𝚜 𝚊 πšπš’πš–πšŽ πšπš›πšŠπšŸπšŽπš•πšŽπš›? π™±πšŽπšŒπšŠπšžπšœπšŽ πšπš‘πšŽπš’ πš‘πšŽπšŠπš›πš πšπš‘πšŠπš πš’πš— πšπš‘πšŽ πš™πšŠπšœπš, πšπš‘πšŽπš’ πšπš’πšπš—'𝚝 πš‘πšŠπšŸπšŽ 𝚝𝚘 πšπšŽπšŠπš• πš πš’πšπš‘ π™Έπš‚π™Ύ πšπšŠπšπšŽπšπš’πš–πšŽ πšœπšπš›πš’πš—πšπšœ!

Pull requests are always welcome! You can open a PR and I'll take a look at it soon. You can also ask me questions here :)

I agree to create a file utils.py to include the new function parse_iso_datetime, it's a good idea. The code looks good, I would only add a logger to this code:

from logging import getLogger

logger = getLogger(__name__)

def get_score(self, package_rating: "PackageRating") -> ScoreValue:
    ...
    try:
        dt = parse_iso_datetime(iso_dt)
    except ValueError:
        logger.warning("Invalid datetime received for package %s: %s", package_rating.package.name, iso_dt)
        return ScoreValue(0)

The code should not fail if it receives an unsupported date, but it should not fail silently either. Otherwise, it will be difficult to detect and fix the error if the date format changes in the future.

If you can, include or modify the existing tests to check the modification. It would be great if you could reproduce the bug in a test and then fix it. If not, don't worry, I can take care of the tests. The code should have types. If you have any questions, just let me know.

Thank you for your interest and for your time!

The Python programmer from the present ran away to the past to escape from date string conversions, but the programmers from the future traveled to our present to escape from the 2038 problem.

SoyGema commented 11 months ago

Hey @Nekmo ! I had a chance to give it a shot! Happy if you could have a look and give me some feedback tanty-de-lorean π™ΌπšŠπš›πšπš’: πšƒπš‘πšŠπš'𝚜 πš‘πšŽπšŠπšŸπš’, π™³πš˜πšŒ .

Nekmo commented 11 months ago

Hey! Count on the PR, I'll give my best to have it before Sunday

If you need help, just tell me :)