BiologicalRecordsCentre / iRecord

Repository to store and track enhancements, issues and tasks regarding the iRecord website.
http://irecord.org.uk

Enable storage of unshared data #1396

Open kitenetter opened 1 year ago

kitenetter commented 1 year ago

The requirement is for verifiers to be able to upload records that cannot be shared other than for verification. Use cases:

Suggested approach:

  1. Set up a non-shared 'website' within the data warehouse
  2. Enable the iRecord upload tools to be directed into this private database - only verifiers should be able to access this
  3. Provide a tool for verifiers to transfer records from the 'private' dataset to the 'open' dataset
  4. Will need to provide at least two surveys: a clone of the iRecord Import survey and a clone of the iRecord Moths survey
  5. Enable metadata to be stored that indicates why the data is unable to be shared (not sure if this should be a survey attribute or metadata associated directly with the import)

Additional item (that may need to be transferred to a separate issue):

  1. Ideally, provide a tool for enabling records to be transferred from the verifier's account to another user's account (e.g. if a verifier uploads records from another recorder, and that recorder subsequently registers on iRecord and wants 'ownership' of their own records).

johnvanbreda commented 1 year ago

A few thoughts and questions @kitenetter.

  1. Is there a requirement to move existing public data from iRecord back to the private unshared website, or is this a one way tool? There are implications in terms of data that have already got out into the wild and also the fact that many surveys will contain data that cannot be moved to the private area (because only limited survey datasets are available in the private area), so a one-way tool would be simpler.
  2. We can provide a special version of the import tool for private records and control access to it via a Drupal role.
  3. We can provide an "Explore private data area" records tool, which might be something like the Explore tool. Presumably limited to my records and controlled by the same Drupal role.
  4. Presumably we need to be able to filter to an import, so a tool for listing the private data area imports and loading their records into the private data explore page.
  5. That means we need to add the import GUID field into Elasticsearch so it can be filtered on.
  6. The private explore tool then needs a button to move all the currently filtered records, or a selection, to the public area. It should just need to alter the website ID and survey ID; everything else stays the same. It may be best to do this immediately (with a progress bar), otherwise users may be confused by records still appearing after they have been moved. Or at least temporarily disable the records in Elasticsearch so they disappear until they are properly re-indexed.
  7. The new importer already has the ability to store a comment for each import. Is this good enough for capturing the metadata about why the records are kept private?
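
Points 4 and 5 above amount to filtering the Elasticsearch index on an import identifier. A minimal sketch of such a filter follows. The field names `metadata.import_guid` and `metadata.website.id` are assumptions made for illustration and are not confirmed anywhere in this thread, as is the helper name `build_import_filter`.

```python
def build_import_filter(import_guid: str, website_id: int) -> dict:
    """Build an Elasticsearch query body limited to one import on one website.

    Both field names below are assumptions, not confirmed index fields.
    """
    if not import_guid:
        raise ValueError("import_guid must not be empty")
    return {
        "query": {
            "bool": {
                # 'filter' clauses match exactly and do not affect scoring.
                "filter": [
                    {"term": {"metadata.import_guid": import_guid}},
                    {"term": {"metadata.website.id": website_id}},
                ]
            }
        }
    }
```

The returned dictionary would be sent as the body of a search request. Using `filter` rather than `must` keeps the match exact and cacheable.
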

kitenetter commented 1 year ago

  1. I agree that there are good reasons not to move existing public data into private, and I am not aware that this has been requested, so no requirement for this.
  2. Sounds good. We may want to make this functionality available to all verifiers, so may not need a new Drupal role - I will review that.
  3. Sounds good.
  4. I don't think this has been discussed but it sounds like a good idea.
  5. Yes, which I guess would also give us the option to filter by import for public data as well, although that will need to be factored in to the work on making filtering easier to understand.
  6. That sounds correct, but see comment 8 below re verification status.
  7. Probably good enough, though we may need to prompt people as to what information we are expecting in that comment. What happens to the stored comment if the records are moved to public? Will the stored comment also become public? We may need a way of preventing the stored comment from becoming public, or at least of ensuring that any such comment is reviewed/edited before becoming public.

Additional comments

  8. Verifiers need to be able to verify the records that they have uploaded into the private area, so ideally we need a way of enabling a verifier to bring private records into their verification grid. And then if the records do subsequently get moved to public the verification decision should stay with them.
  9. Verifiers need to be able to download their private data; not sure if this needs a separate download process or if it becomes an option that is added to the standard download page (if the latter then it is an option that will need to be linked to whichever role we use to make the private data functions available to users).

johnvanbreda commented 1 year ago

  7. The import comments haven't been used anywhere yet so are not visible as part of a record. They are metadata about the import itself, so won't normally be seen along with the record (unless we change that of course).
  8. Agreed - I think the private data area will be shared for verification purposes only. Moving them won't affect the record status.
  9. It will be available in their verification filter downloads.

OK, I'll proceed on that basis then, with the proviso that you might yet review where the metadata about reasons for keeping the records private will go.

johnvanbreda commented 1 year ago

Coding for recordsMover control now done.

johnvanbreda commented 1 year ago

Noting work in progress - I've configured a new website registration on the warehouse, ID 152 (iRecord unshared data). This has a survey dataset, ID 719 (iRecord unshared imports), which is a clone of the iRecord import survey dataset so that records can be freely moved between the 2 datasets.

Website ID 152 is configured to allow verification but not public reporting on iRecord.

On both live and test there are now the following pages, all accessible to verifiers only but not yet in the menu:

Once the code for the recordsMover component is released it can be added to the explore unshared records page which completes the functionality required for this task.

johnvanbreda commented 1 year ago

The records mover button (Share records button) is now available for testing at https://test-brc-irecord.pantheonsite.io/explore/my-unshared.

kitenetter commented 1 year ago

Can we add a clone of survey ID 90, "iRecord Moths", to the unshared data website? I don't think we need any others in the short term, although it may be that other recording schemes will request additional options in future.

kitenetter commented 1 year ago

Have just done a test upload into the unshared data website. All seemed to go well.

On page /explore/my-unshared the "View record details" button currently leads to a "Page not found" result, but I assume the plan is to replace this with the "/record-details/verification" option on the button?

Arguably we could disable the "View species details" button, since that page won't include the unshared records, but I guess it could still provide useful context so probably best to keep this one.

I think we need to add an "Edit this record" button on the Explore page (unless a different approach to editing is needed for the unshared records).

kitenetter commented 1 year ago

When I did a download of records using my verification role, the unshared records were not included in the download - not sure if this is part of the process that hasn't yet been implemented.

kitenetter commented 1 year ago

Questions regarding the records mover button:

johnvanbreda commented 1 year ago

The records mover button has a configuration containing a list of source website/survey IDs and their matching destination website/survey IDs so that the records can only go from the 2 surveys in the unshared website to their matching surveys in the iRecord website - so there is no choice and no chance the user can move the records to an incompatible survey.
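
The mapping described above can be pictured as a lookup from a source website/survey pair to its destination pair. The sketch below is illustrative only and is not the recordsMover code. Website ID 152 and survey ID 719 come from this thread; `PUBLIC_WEBSITE_ID`, `PUBLIC_IMPORT_SURVEY_ID`, the `Record` class and `move_record` are hypothetical placeholders.

```python
from dataclasses import dataclass, replace

# 152 and 719 are taken from the thread; the public IDs are hypothetical.
UNSHARED_WEBSITE_ID = 152
UNSHARED_IMPORT_SURVEY_ID = 719
PUBLIC_WEBSITE_ID = 1            # hypothetical placeholder
PUBLIC_IMPORT_SURVEY_ID = 9001   # hypothetical placeholder

# (source website, source survey) -> (destination website, destination survey)
MOVE_MAP = {
    (UNSHARED_WEBSITE_ID, UNSHARED_IMPORT_SURVEY_ID):
        (PUBLIC_WEBSITE_ID, PUBLIC_IMPORT_SURVEY_ID),
}


@dataclass(frozen=True)
class Record:
    id: int
    website_id: int
    survey_id: int


def move_record(record: Record) -> Record:
    """Return a copy of the record re-pointed at its mapped destination.

    Only the website and survey IDs change. Records from a survey with
    no configured destination are rejected rather than moved.
    """
    key = (record.website_id, record.survey_id)
    if key not in MOVE_MAP:
        raise ValueError(f"No destination configured for website/survey {key}")
    website_id, survey_id = MOVE_MAP[key]
    return replace(record, website_id=website_id, survey_id=survey_id)
```

Because the destination is looked up rather than chosen, a record can only ever land in the survey paired with its source.
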

Agree about the last point - how do you see this working?

kitenetter commented 1 year ago

Don't know how feasible this is, but suggest:

  1. Click on button to move records
  2. If an entire sample or set of samples is being moved, proceed as normal
  3. If the record/s being moved do not form complete samples, pop up a message to say "Record/s form part of a larger sample and cannot be moved in isolation". Provide another button that says "View sample/s for record/s", which opens up a grid of the linked samples and allows individual samples to be chosen and moved.
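
The check in step 3 could be sketched as below, assuming the occurrence-to-sample relationship is available as a simple mapping. The function name `find_incomplete_samples` and the `sample_of` mapping are hypothetical and not part of any existing code.

```python
from collections import defaultdict


def find_incomplete_samples(selected_ids, sample_of):
    """Return the sample IDs that are only partly covered by the selection.

    selected_ids: occurrence IDs the user has chosen to move.
    sample_of: maps every occurrence ID in the dataset to its sample ID.
    """
    selected = set(selected_ids)

    # Group all known occurrences by their parent sample.
    members = defaultdict(set)
    for occurrence_id, sample_id in sample_of.items():
        members[sample_id].add(occurrence_id)

    # A touched sample is incomplete if any of its occurrences is unselected.
    touched = {sample_of[o] for o in selected if o in sample_of}
    return sorted(s for s in touched if not members[s] <= selected)
```

An empty result means every affected sample is complete and the move can proceed. Otherwise the returned sample IDs are the ones to list in the "View sample/s for record/s" grid.
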

Only other alternative I can think of is to design a new grid for the "Explore my unshared" page, which is structured to show the sample-occurrence hierarchy, and perhaps only allows complete samples to be filtered and selected for moving?

Something along these lines would definitely be desirable, but I don't think it stops us making the new features available as they stand (once we have guidance in place).

johnvanbreda commented 1 year ago

@kitenetter offering an alternative view that lists all the samples for the selected records, then allows you to expand and view the occurrences, sounds feasible. Or this alternative view could just be a reload of the same occurrences explore page, but filtered to show all records from the selected samples. The difficulty here would be that the filter would have to work by passing a list of sample IDs to the Elasticsearch query. We'd need to do some tests on the likely upper limit for the list of sample IDs - do you have a feeling for the largest likely number of records in a transfer?
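
One way to pass a list of sample IDs to Elasticsearch is a `terms` filter. Elasticsearch limits the number of values in a single `terms` query through the `index.max_terms_count` index setting, which defaults to 65,536, so a very large transfer would need splitting into several queries. The sketch below assumes the sample ID is indexed as `event.event_id`; that field name and the helper `sample_id_queries` are assumptions for illustration.

```python
def sample_id_queries(sample_ids, max_terms=65536):
    """Build one query body per chunk of at most max_terms sample IDs.

    The field name 'event.event_id' is an assumption about the index.
    """
    if max_terms < 1:
        raise ValueError("max_terms must be at least 1")

    # De-duplicate and sort so the chunks are stable and as small as possible.
    ids = sorted(set(sample_ids))
    return [
        {
            "query": {
                "bool": {
                    "filter": [
                        {"terms": {"event.event_id": ids[i:i + max_terms]}}
                    ]
                }
            }
        }
        for i in range(0, len(ids), max_terms)
    ]
```

Each returned body would be run as a separate search and the results combined, which is why a realistic upper bound on the number of samples matters.
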

kitenetter commented 1 year ago

I don't really have a feeling for the scale of this. It's not yet clear how many verifiers will actually want to use a non-shared area, and of those that do it is very hard to know how many records they might upload, and how many samples that will translate to. Sorry!

johnvanbreda commented 1 year ago

I wonder if implementing a potentially tricky samples based solution is best avoided until we have a clearer picture of how this is going to be used. The current solution does work as it is, so maybe the documentation can explain that the records move can only work if the samples are complete. If this turns out to be a regular issue, then we can write code which shows all offending samples in a grid with the ability to expand to view the occurrences, and move all samples across, or a single selected sample.

johnvanbreda commented 10 months ago

@kitenetter any thoughts on where to go with this?