Closed harshad16 closed 2 years ago
/triage accepted /sig stack-guidance /priority critical-urgent
As there was no distinction or uniqueness added in the table https://github.com/thoth-station/storages/blob/a95f2618ee94f0e519ffa93f5f6c93a9cdba0658/thoth/storages/graph/models.py#L99 the duplication was happening. By making a unique index constriant between license_name, license_identifier, license_version, we resolved the issue. https://github.com/thoth-station/storages/commit/b357f951ec5633facf8e498af5339b108b5b9d49
The database is updated with this constraint with the latest release of storage #2712
Also the data is updated , to fix the issue in already existing data:
update the foreign key:
update python_package_version a
set package_license = b.newID
from (select c.id, d.reqID as newID
from python_package_license c
inner join (select license_name, license_identifier, license_version, min(id) as reqID
from python_package_license
group by license_name, license_identifier, license_version) as d
on c.license_name = d.license_name and c.license_identifier=d.license_identifier and c.license_version = d.license_version) as b
where a.package_license = b.id;
delete the existing duplicate data:
delete
from python_package_license
where id not in ( select min(id) as id
from python_package_license
group by license_name, license_identifier, license_version);
closing the issue.
Bug description
While working #2703, we found out that python_package_license is having duplicate values.
Steps to Reproduce
Steps to reproduce the behavior:
Expected behavior
No duplicate values.