openSUSE / cavil

The legal review and SBOM system used by SUSE and openSUSE
GNU General Public License v2.0

Error-0:Yv6G - pocl empty on checkout works on reimport #62

Closed lkocman closed 1 year ago

lkocman commented 1 year ago

Context: Leap has a backlog of 60+ requests that have not been reviewed for over 8 days.

Sebastian Riedl identified this particular issue: there were two requests for pocl in the backlog, and one of them has an Error-0:Yv6G, which means it was empty when checked out from OBS.

The legal report should not be empty, judging by the source listing from OBS:

$ curl https://api.opensuse.org/public/source/science/pocl?rev=05f1e68e1f6817c7e6c5391f8eac871e
<directory name="pocl" rev="05f1e68e1f6817c7e6c5391f8eac871e" srcmd5="05f1e68e1f6817c7e6c5391f8eac871e">
  <linkinfo project="openSUSE:Factory" package="pocl" srcmd5="c566cacae98e515d0e0c93647593f951" baserev="c566cacae98e515d0e0c93647593f951" lsrcmd5="c492a9be0daa12e8a8e737d7959356f9"/>
  <entry name="link_against_libclang-cpp_so.patch" md5="fb3145931e75c3a11f764f22e68425cf" size="553" mtime="1608907150"/>
  <entry name="pocl-3.0.tar.gz" md5="bd79db59fa31e38759296849291210a3" size="1722809" mtime="1662482194"/>
  <entry name="pocl-rpmlintrc" md5="a8031c13cb3a4cb232bed0fd7f42dd4e" size="45" mtime="1662487601"/>
  <entry name="pocl.changes" md5="0c2660588be38db939d7f04cf1c5ec7b" size="16917" mtime="1667384010"/>
  <entry name="pocl.spec" md5="67906134b152d59a355044292367b6ce" size="4377" mtime="1667388077"/>
</directory>
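
The same check can be scripted. Below is a minimal sketch, assuming the listing above is already available as a string (abbreviated here to two entries). `list_entries` is a hypothetical helper for illustration, not part of Cavil or the OBS tooling, and it only uses the Python standard library.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the OBS directory listing shown above.
LISTING = """\
<directory name="pocl" rev="05f1e68e1f6817c7e6c5391f8eac871e" srcmd5="05f1e68e1f6817c7e6c5391f8eac871e">
  <linkinfo project="openSUSE:Factory" package="pocl"/>
  <entry name="pocl-3.0.tar.gz" md5="bd79db59fa31e38759296849291210a3" size="1722809" mtime="1662482194"/>
  <entry name="pocl.spec" md5="67906134b152d59a355044292367b6ce" size="4377" mtime="1667388077"/>
</directory>
"""


def list_entries(xml_text):
    """Return (name, size) for every <entry> in an OBS directory listing."""
    root = ET.fromstring(xml_text)
    return [(e.get("name"), int(e.get("size"))) for e in root.findall("entry")]


entries = list_entries(LISTING)
# A non-empty list means this revision has source files to check out.
print(len(entries) > 0, [name for name, _ in entries])
```

A listing with no `<entry>` elements yields an empty list, which is the case the error code would be expected to describe.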

The request works fine on manual reimport.

kraih commented 1 year ago

One pattern I noticed in the data is that there is always more than one report in the backlog for the same package, and it is always the older one that has the broken report. To me this suggests that something might be going wrong in the cleanup code for obsolete packages. Perhaps the checkout gets deleted and the weekly reindexing then creates an empty report. After all, we are dealing with weeks and months in the backlog here. 🤔

kraih commented 1 year ago

13 reports got corrupted again over the weekend, so now I have a pretty good picture of what is happening. The cleanup deletes the checkouts for some reason, and the automatic weekly reindexing then can't create new reports, leaving them in a corrupted state.

   id   | state |        created
--------+-------+------------------------
 334879 | new   | 2022-12-14 15:04:31+01
 332536 | new   | 2022-11-27 13:31:19+01
 332505 | new   | 2022-11-27 13:30:43+01
 333060 | new   | 2022-12-02 01:07:35+01
 331283 | new   | 2022-11-12 15:27:18+01
 334861 | new   | 2022-12-14 12:28:46+01
 332299 | new   | 2022-11-24 14:02:06+01
 331221 | new   | 2022-11-12 10:49:55+01
 332669 | new   | 2022-11-29 06:09:14+01
 330878 | new   | 2022-11-09 14:10:36+01
 333786 | new   | 2022-12-07 17:09:28+01
 337933 | new   | 2023-01-16 11:11:31+01
 331247 | new   | 2022-11-12 10:50:29+01
(13 rows)

kraih commented 1 year ago

This turned out to be a rather complex situation that ultimately goes back to the introduction of automatic accepts for Factory requests. The life cycle of a package goes something like this:

  1. New Factory request
  2. Factory request submitted to legaldb (obs#123)
  3. Auto approved after a few hours
  4. Factory request removed from legaldb
  5. Request reimported via product (openSUSE:Factory)
  6. Product request is not reviewed and remains in backlog
  7. Package submitted from Factory to Leap
  8. Leap request submitted to legaldb (review request already exists for openSUSE:Factory)
  9. New Factory request for next version of package
  10. Factory request submitted to legaldb (obs#124)
  11. Auto approved after a few hours too
  12. Factory request removed from legaldb too
  13. Request reimported via product too (openSUSE:Factory)
  14. The legaldb now detects duplicate review requests for the same package in the openSUSE:Factory product and flags the older one as obsolete
  15. This is where the race condition occurs. One of two things happens: a) the OBS bot notices and quickly removes the obsolete flag, or b) the cleanup process triggers and removes the source checkout from disk
  16. Over the weekend all legal reports are recreated, and if the checkout was removed in step 15, the report is now corrupted
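
Steps 14 to 16 can be modeled as a toy simulation to show why the outcome depends only on ordering. This is a sketch under assumptions: the `Review` class, its fields, and the three functions are hypothetical stand-ins, not Cavil's actual data model or jobs.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """Toy stand-in for a legaldb review request (hypothetical fields)."""
    id: int
    obsolete: bool = False
    checkout_on_disk: bool = True
    report: str = "ok"


def bot_unflags(review):
    # Step 15a: the OBS bot notices the request is still needed.
    review.obsolete = False


def cleanup(review):
    # Step 15b: cleanup removes the checkout of anything flagged obsolete.
    if review.obsolete:
        review.checkout_on_disk = False


def reindex(review):
    # Step 16: the weekly job rebuilds the report from the checkout.
    review.report = "ok" if review.checkout_on_disk else "empty"


def run(order):
    review = Review(id=1, obsolete=True)  # step 14 flagged the older request
    for step in order:
        step(review)
    reindex(review)
    return review.report


print(run([bot_unflags, cleanup]))  # bot wins the race: prints "ok"
print(run([cleanup, bot_unflags]))  # cleanup wins the race: prints "empty"
```

In the second ordering the obsolete flag is gone by the time of the reindex, but the checkout is already deleted, which matches the picture of a `new` request with an empty report.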

The system depends a bit too much on the external_link field for duplicate detection, but it's the best we've got for now. I'm now testing a possible fix that allows the external_link to be changed from openSUSE:Factory to obs#... during step 8. In theory, that should prevent the race condition.
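
To illustrate why changing the external_link helps, here is a toy model of duplicate detection keyed on package name and external_link. It is only a sketch: `flag_obsolete`, the dict keys, the assumption that a higher id means a newer review, and the placeholder link `obs#<leap-request>` are all hypothetical, and Cavil's real logic may differ.

```python
def flag_obsolete(reviews):
    """Return ids of all but the newest review per (package, external_link).

    Each review is a dict with hypothetical keys: id, package, external_link.
    A higher id is assumed to mean a newer review.
    """
    newest = {}
    for r in reviews:
        key = (r["package"], r["external_link"])
        if key not in newest or r["id"] > newest[key]["id"]:
            newest[key] = r
    return sorted(
        r["id"] for r in reviews
        if newest[(r["package"], r["external_link"])] is not r
    )


# Without the fix: both reviews carry the product link and collide.
before = [
    {"id": 1, "package": "pocl", "external_link": "openSUSE:Factory"},
    {"id": 2, "package": "pocl", "external_link": "openSUSE:Factory"},
]
print(flag_obsolete(before))  # [1]

# With the fix: the older review was relinked in step 8 (placeholder value).
after = [
    {"id": 1, "package": "pocl", "external_link": "obs#<leap-request>"},
    {"id": 2, "package": "pocl", "external_link": "openSUSE:Factory"},
]
print(flag_obsolete(after))  # []
```

Because the older review no longer shares a key with the new product import, it is never flagged obsolete, so the cleanup has nothing to delete and the race cannot start.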

kraih commented 1 year ago

It seems the applied fix has resolved the issue.