chanzuckerberg / single-cell-data-portal

The data portal supporting the submission, exploration, and management of projects and datasets to cellxgene.
MIT License

[Discovery] - Should private collections and private revisions warn submitters / users about their state ? #6377

Open brianraymor opened 10 months ago

brianraymor commented 10 months ago

Context

See:

Private Collections and permanent links

Since its inception, CELLxGENE Discover has guaranteed that the permanent (canonical) URL(s) for both a private collection and any of its related CXG(s) do not change when the collection is published. This allows submitters to reference their CELLxGENE collection and visualization in journal articles in preparation. The expectation was that the collection would be published before the article appears. There have been recent cases where the journal article was published while the collection was still private. Users redirected from the journal article receive no warning that the collection is private.

Perhaps it would be helpful to have a banner or watermark indicating that the collection/CXG(s) are private and have not been published - as a prompt to the submitter to change their state or contact the curators.

Private revisions and version links

With the introduction of versioning in 2023, private revisions of public collections receive their own permanent version URL(s) for both the revision and its updated CXG(s). When the revision is re-published, the update is published as a refresh of the canonical URL, although the version URL(s) remain available via the Discover API for reproducibility scenarios. Data submitters must not reference the permanent version URL in publications, because it will soon be outdated by schema updates. The canonical URL(s) will always reference the latest version.

It seems like we need a caution about the version URL(s) in revisions?

CC: @hthomas-czi

jahilton commented 2 months ago

Amanda has set up an Airtable report that is produced on the 1st of every month and shows access to private URLs over the previous month. Aug 1 was the first official run, and from our evaluation of it, 13 private links have been confirmed to be shared in publications, GitHub repos, lab sites, or other data resources.

For the banner/watermark, I'm not sure it should be aimed at the submitter; it seems like they may submit and then forget about it. So maybe it should target the random user: "Warning - unreleased data may contain errors, may be subject to access disruptions, and is not versioned. If you have come to this via a publicly-shared link, please contact the cellxgene team so they can work with the contributor to finalize their submission."
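
To make the proposal concrete, here is a minimal sketch of how the banner choice could be decided. It is not taken from the portal codebase: `Visibility`, `CollectionState`, `banner_for`, and the revision wording are all hypothetical names and placeholder text, and the only inputs assumed are whether the collection is private and whether it is a private revision of a public collection.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Visibility(Enum):
    # Hypothetical states; the portal's real model may differ.
    PUBLIC = "public"
    PRIVATE = "private"


@dataclass(frozen=True)
class CollectionState:
    visibility: Visibility
    # True when this is a private revision of an already public collection.
    is_revision: bool = False


# Wording suggested in this thread for visitors arriving at a private collection.
PRIVATE_COLLECTION_WARNING = (
    "Warning - unreleased data may contain errors, may be subject to access "
    "disruptions, and is not versioned. If you have come to this via a "
    "publicly-shared link, please contact the cellxgene team so they can work "
    "with the contributor to finalize their submission"
)

# Placeholder wording (an assumption) for the version-URL caution on revisions.
REVISION_WARNING = (
    "This is a private revision. Do not reference its version URL in "
    "publications; reference the canonical collection URL instead."
)


def banner_for(state: CollectionState) -> Optional[str]:
    """Return the banner text to display, or None when no banner is needed."""
    if state.visibility is Visibility.PUBLIC:
        return None
    if state.is_revision:
        return REVISION_WARNING
    return PRIVATE_COLLECTION_WARNING


if __name__ == "__main__":
    for state in (
        CollectionState(Visibility.PUBLIC),
        CollectionState(Visibility.PRIVATE),
        CollectionState(Visibility.PRIVATE, is_revision=True),
    ):
        print(state, "->", banner_for(state))
```

Keeping the decision in one pure function would let the collection page and the CXG view share the same rule, though where such a check would actually live is an open question.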