icgc-argo / argo-clinical

Clinical data submission for ARGO programs.
GNU Affero General Public License v3.0

🐛 Donor TSV Data includes empty records for specimens #1090

Open joneubank opened 8 months ago

joneubank commented 8 months ago

Describe the bug

There is a case where donor TSV downloads include empty records in the specimens.tsv file. An empty record contains the donor ID and program short name fields but no other data, so it looks like junk data.

The cause is that the program has registered more specimens than it has provided data for. For example, if a program has registered 3 specimens (through the sample_registration screen) but has only provided specimen data for 2 of those 3, the resulting specimens.tsv file will have 3 records, one of which is completely empty (only donor and program IDs, not even the specimen ID of the missing record).
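
The situation above can be illustrated with a minimal TypeScript sketch. The types, field names, and the `toSpecimenRows` function are hypothetical and are not taken from the argo-clinical code base. The sketch assumes that each registered specimen carries a `clinicalInfo` object that stays empty until specimen data is submitted, and that each TSV row is built by merging the donor and program IDs with that object.

```typescript
// Hypothetical shapes for illustration only; not the actual argo-clinical models.
type SpecimenRow = Record<string, string | number>;

interface Specimen {
  submitterId: string; // assigned at sample registration
  clinicalInfo: SpecimenRow; // assumed to stay empty until specimen data is submitted
}

interface Donor {
  donorId: string;
  programId: string;
  specimens: Specimen[];
}

// Emits one row per REGISTERED specimen, whether or not data was submitted.
function toSpecimenRows(donor: Donor): SpecimenRow[] {
  return donor.specimens.map((specimen) => ({
    donor_id: donor.donorId,
    program_id: donor.programId,
    ...specimen.clinicalInfo,
  }));
}

// Three specimens registered, clinical data submitted for only two of them.
const donor: Donor = {
  donorId: 'DO1',
  programId: 'TEST-CA',
  specimens: [
    { submitterId: 'SP1', clinicalInfo: { submitter_specimen_id: 'SP1', specimen_type: 'Normal' } },
    { submitterId: 'SP2', clinicalInfo: { submitter_specimen_id: 'SP2', specimen_type: 'Primary tumour' } },
    { submitterId: 'SP3', clinicalInfo: {} },
  ],
};

const rows = toSpecimenRows(donor);
// The third row holds only donor_id and program_id, matching the "empty record" in the report.
console.log(rows);
```

Under these assumptions the third row has no specimen ID because the ID column is filled from the submitted clinical data, not from the registration record.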

Steps To Reproduce

  1. Register a new sample for a new donor with a new specimen ID.
  2. Use the download TSV by donor endpoint, requesting files for the donor you registered the specimen for.
  3. The specimens.tsv will contain the empty specimen record.

Expected behaviour

The downloaded specimens.tsv file should only contain records for which the specimen data has been submitted.
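
One possible way to get this behaviour is to drop specimens with no submitted data before building rows. The sketch below is only an illustration: the types, field names, `hasSubmittedData`, and `toSubmittedSpecimenRows` are hypothetical, and it assumes that a specimen without submitted data can be recognised by an empty `clinicalInfo` object. The real data model may need a different check.

```typescript
// Hypothetical shapes for illustration only; not the actual argo-clinical models.
type SpecimenRow = Record<string, string | number>;

interface Specimen {
  submitterId: string; // assigned at sample registration
  clinicalInfo: SpecimenRow; // assumed to be empty until specimen data is submitted
}

interface Donor {
  donorId: string;
  programId: string;
  specimens: Specimen[];
}

// A specimen counts as submitted when its clinical info has at least one field.
function hasSubmittedData(specimen: Specimen): boolean {
  return Object.keys(specimen.clinicalInfo).length > 0;
}

// Builds rows only for specimens whose data has been submitted.
function toSubmittedSpecimenRows(donor: Donor): SpecimenRow[] {
  return donor.specimens.filter(hasSubmittedData).map((specimen) => ({
    donor_id: donor.donorId,
    program_id: donor.programId,
    ...specimen.clinicalInfo,
  }));
}

// Three specimens registered, clinical data submitted for only two of them.
const exampleDonor: Donor = {
  donorId: 'DO1',
  programId: 'TEST-CA',
  specimens: [
    { submitterId: 'SP1', clinicalInfo: { submitter_specimen_id: 'SP1', specimen_type: 'Normal' } },
    { submitterId: 'SP2', clinicalInfo: { submitter_specimen_id: 'SP2', specimen_type: 'Primary tumour' } },
    { submitterId: 'SP3', clinicalInfo: {} },
  ],
};

const submittedRows = toSubmittedSpecimenRows(exampleDonor);
// Only the two specimens with submitted data produce rows.
console.log(submittedRows);
```

Filtering before the rows are built, instead of cleaning up the TSV afterwards, keeps the check in one place for any caller that consumes the same rows.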

Extra Details

The same code that pulls specimen data from the database for download also serves the Submitted Data screens, so empty specimen records show up there as well. There is no separate issue for this (at this time), but if the fix is applied to the code that extracts entities from the DB, both locations should be fixed at the same time.