chanzuckerberg / single-cell-curation

Code and documentation for the curation of cellxgene datasets
MIT License

Datasets with incorrect `is_primary_data=True` #528

Open pablo-gar opened 1 year ago

pablo-gar commented 1 year ago

From @jahilton

@bkmartinjr surfaced observations in the Census that are `is_primary_data: True` and whose count data match another observation that is also `is_primary_data: True`.
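For illustration only, a check of this kind can be sketched in plain Python. This is not the Census's actual detection code: the function names, the `(obs_id, is_primary_data, counts)` tuple layout, and the example IDs are all assumptions made for this sketch. It fingerprints each observation's nonzero raw counts and reports groups of primary observations that share a fingerprint.

```python
import hashlib
from collections import defaultdict


def row_fingerprint(counts):
    """Hash one observation's nonzero raw counts (feature -> count).

    Hypothetical helper: a real pipeline would more likely hash rows of a
    sparse count matrix, but the idea is the same.
    """
    # Sort by feature name and drop zeros so the hash does not depend on
    # dict ordering or on whether zero counts are stored explicitly.
    items = sorted((f, c) for f, c in counts.items() if c != 0)
    return hashlib.sha256(repr(items).encode("utf-8")).hexdigest()


def find_duplicate_primary_obs(observations):
    """Return groups of obs IDs marked primary that share identical counts.

    `observations` is an iterable of (obs_id, is_primary_data, counts)
    tuples; this layout is an assumption of the sketch.
    """
    groups = defaultdict(list)
    for obs_id, is_primary, counts in observations:
        if is_primary:  # only primary observations should be unique
            groups[row_fingerprint(counts)].append(obs_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up example data, not taken from any real dataset.
example = [
    ("dataset_A:cell_1", True, {"GENE1": 3, "GENE2": 0, "GENE3": 7}),
    ("dataset_B:cell_9", True, {"GENE3": 7, "GENE1": 3}),
    ("dataset_B:cell_10", True, {"GENE1": 1}),
    ("dataset_C:cell_4", False, {"GENE1": 3, "GENE3": 7}),
]
print(find_duplicate_primary_obs(example))
# → [['dataset_A:cell_1', 'dataset_B:cell_9']]
```

The non-primary copy in `dataset_C` is ignored on purpose: identical counts are expected when one dataset reuses cells from another, as long as only one copy is marked `is_primary_data: True`.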

jahilton commented 1 year ago

The final 2 datasets to investigate/correct are from the same group. We have reached out regarding the more significant one (4,886 obs), and they are investigating. Once a path forward is determined, we will raise the other (8 obs) with them.

brianraymor commented 1 year ago

@jahilton - is there any update on the final 2 datasets?

jahilton commented 1 year ago

We did inform the group of the second dataset. They are investigating both.

brianraymor commented 1 year ago

@jahilton - is there any update on the final 2 datasets?

jahilton commented 11 months ago

Heard from the contributor linked to a48f5033-3438-4550-8574-cdff3263fdfd (case with 8 duplicated obs):

> maybe there are some errors in transferring the sample labels when we merge the datasets for each sample, which also results in duplicated cells. We are now working on mapping those messed-up cells to their correct sample labels, which would inform us where the duplicated cells are coming from

jahilton commented 8 months ago

Update on a48f5033-3438-4550-8574-cdff3263fdfd - they provided updated files that no longer have duplicated obs data, but there were other issues in the metadata that are being ironed out.

jahilton commented 7 months ago

Update on a48f5033-3438-4550-8574-cdff3263fdfd - revised and published (cc @pablo-gar)

jahilton commented 5 months ago

Still no word from the contributors of our last case on whether they have found a solution. We reached out to them again today to check in.

jahilton commented 1 week ago

We have reached out again to the contributors of our last case. We have set a deadline of Dec 12 for them to resolve the issue; otherwise the Collection will be deleted.