google / osv.dev

Open source vulnerability DB and triage service.
https://osv.dev
Apache License 2.0
1.54k stars 188 forks source link

fix(sourcerepo-sync): improve debugability #2700

Closed andrewpollock closed 1 month ago

andrewpollock commented 1 month ago

Problem:

+ poetry run python source_sync.py --kind SourceRepository --project oss-vdb-test --file ../../source_test.yaml --no-dry-run --verbose
Loaded 26 local source repositories
Validated 26 source repositories
Retrieved 26 source repositories from datastore
Traceback (most recent call last):
  File "/workspace/tools/sourcerepo-sync/source_sync.py", line 162, in <module>
    main()
  File "/workspace/tools/sourcerepo-sync/source_sync.py", line 134, in main
    validate_repository(ds_repos, False)
  File "/workspace/tools/sourcerepo-sync/source_sync.py", line 69, in validate_repository
    if 'link' in repo and repo['link'][-1] != '/':
                          ~~~~~~~~~~~~^^^^
TypeError: 'NoneType' object is not subscriptable

happening on https://github.com/google/osv.dev/pull/2699 was harder to get to the bottom of than I'd have liked...

Capture some of the knowledge gained as part of debugging this in some basic service documentation to help the next person

andrewpollock commented 1 month ago

I'm having some feelings about all of the functions doing the things being nested within main()... It doesn't feel like it's adhering to the spirit of the Python style guide on this?

michaelkedar commented 1 month ago

I'm having some feelings about all of the functions doing the things being nested within main()... It doesn't feel like it's adhering to the spirit of the Python style guide on this?

+1 - it looks like most of the nested functions are relatively straightforward to extract

andrewpollock commented 1 month ago

+1 - it looks like most of the nested functions are relatively straightforward to extract

Yeah I was trying to time-box the amount of time I spent on this, as I burned close to 2 hours between debugging the failure and making this PR, so I wanted to avoid a large-scale refactor (and attendant breakage) so I can unwind my stack and get back to my other task...

andrewpollock commented 1 month ago

Was this fixed? it seems like that if 'link' in repo check isn't accomplishing what it intended to do.

Yeah good point... I fixed it directly in Datastore to get the code to stop crashing... Looking over https://github.com/google/osv.dev/commits/master/source_test.yaml, it looks a bunch of recent commits haven't been applying successfully 😞

andrewpollock commented 1 month ago

Looking over https://github.com/google/osv.dev/commits/master/source_test.yaml, it looks a bunch of recent commits haven't been applying successfully 😞

Actually it's not quite so dire, it was just this most recent PR that fell over 🤔

The problem seemed to be that the cve-osv source in Datastore was quite different to the YAML reality for it...