dfo-mar-odis / dart

DFO At-sea Reporting Template: collects Elog, CTD and sample data at sea to produce reports, then uploads to BioChem at the end of the mission

Biochem loading dis_data_num key conflicts BioChemUpload/BCD tables #158

Open upsonp opened 2 weeks ago

upsonp commented 2 weeks ago

The way I was previously loading BCS/BCD data worked if each mission was only responsible for its own BCS/BCD tables, but now that we're looking at directly inserting data into the BCDiscreteStnEdits/BCDiscreteDataEdits tables under a user account on the BioChem server, there are going to be conflicts with dis_data_num keys.

The biochem.update.get_bcd_d_rows() and biochem.update.get_bcd_p_rows() functions will have to be reworked to account for dis_data_num keys that belong to other 'batches' (consider all rows from a mission to be a 'batch').

For the moment I've reworked the get_bcd_d_rows() function so that it deletes all rows for a batch and recreates everything, but reloading all the data every time is slow.
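
One possible direction, purely as a sketch (the `bcd_model` argument and the `batch_seq` field name here are assumptions standing in for the real BCDiscreteDataEdits model and its batch column, not current dart code), is to allocate new dis_data_num values above anything already used by other batches so the existing rows for a batch can be kept rather than deleted and rebuilt:

```python
from django.db.models import Max

def next_dis_data_num(bcd_model, batch_name):
    """Return the first dis_data_num that won't collide with rows
    belonging to other batches in the shared BCD edits table.

    `bcd_model` and the `batch_seq` field are assumed stand-ins for the
    real BCDiscreteDataEdits model and its batch column.
    """
    # Find the highest key used by every batch except our own.
    current_max = (
        bcd_model.objects
        .exclude(batch_seq=batch_name)
        .aggregate(max_key=Max('dis_data_num'))['max_key']
    )
    return (current_max or 0) + 1
```

get_bcd_d_rows()/get_bcd_p_rows() could then reuse the keys already assigned to the current batch and only pull fresh keys for new rows, avoiding the full delete-and-reload.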

I do have functionality built in, though never fully developed, to track the upload/modified status of a datatype when it is selected for upload, using the core.models.BioChemUpload model. When the user checks the box for a datatype: [screenshot]

This triggers a function that creates an entry for the datatype in the BioChemUpload model, but then I don't really do anything with it.

What should happen (see the sketch after this list):

1. When the user checks the box for a [datatype]:
    1. An entry is made in the BCUpload table, with a modified_date and the status set to 'upload'.
2. When the user reloads samples for a [datatype] that is already in the BCUpload table:
    1. The modified_date should be updated and the status set to 'upload'.
3. When the user un-checks the box:
    1. Entries in the BCUpload table with status 'upload' and no upload_date are removed; no action is needed because the data was never uploaded.
    2. Entries in the BCUpload table with status 'upload' and modified_date > upload_date should have their status set to 'delete'.
    3. Entries in the BCUpload table with status 'uploaded' are changed to 'delete'.
4. When the user clicks the Biochem Upload button:
    1. Delete from BCD
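
A minimal sketch of those transitions against core.models.BioChemUpload (assuming `type` is a single-valued relationship as discussed further down, and that `status`, `modified_date` and `upload_date` fields exist with the string status values described above; none of this is the current implementation):

```python
from django.utils import timezone
from core import models as core_models

def mark_for_upload(mission_sample_type):
    """Steps 1 and 2: checking the box or reloading samples stamps the
    entry as modified and (re)sets its status to 'upload'."""
    upload, _ = core_models.BioChemUpload.objects.get_or_create(
        type=mission_sample_type)
    upload.modified_date = timezone.now()
    upload.status = 'upload'
    upload.save()

def unmark_for_upload(mission_sample_type):
    """Step 3: un-checking the box either removes the entry (it was never
    uploaded) or flags it so the next upload deletes the rows from BCD."""
    upload = core_models.BioChemUpload.objects.filter(
        type=mission_sample_type).first()
    if upload is None:
        return
    if upload.upload_date is None:
        upload.delete()            # never uploaded, nothing to clean up
    else:
        upload.status = 'delete'   # remove from BCD on the next upload
        upload.save()
```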

A similar process should be drafted for BCS updates. When mission or event data is changed, BCS entries should be updated; once uploaded they rarely change though, and it's time consuming to rebuild BCS rows when there's no reason to re-upload them.
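
One lightweight way to track that, again only a sketch (the mission-level fields below do not exist in the current models), would be to stamp the mission whenever event or mission data changes and only rebuild BCS rows when that stamp is newer than the last BCS upload:

```python
from django.utils import timezone

def flag_bcs_reload(mission):
    """Called wherever mission or event data is edited or reloaded.
    'bcs_modified_date' is an assumed field on the mission model."""
    mission.bcs_modified_date = timezone.now()
    mission.save(update_fields=['bcs_modified_date'])

def bcs_needs_rebuild(mission):
    """Only rebuild BCS rows when something changed since the last BCS
    upload ('bcs_uploaded_date' is likewise an assumed field)."""
    if mission.bcs_uploaded_date is None:
        return True
    return (mission.bcs_modified_date is not None
            and mission.bcs_modified_date > mission.bcs_uploaded_date)
```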

upsonp commented 2 weeks ago

I was going to add a condition to the form_biochem_database.upload_bcd_d_data function that would compare the modified_date to the upload_date and force a re-upload of data that was reloaded after it had been uploaded.

But I think I'd like to have the parsers set the BioChemUpload status when data is loaded.

I like the idea of a consistent process: if events are loaded by core.parsers.andes or core.parsers.elog, a flag is set to indicate that BCS data needs to be loaded/reloaded, and core.parsers.PlanktonParser, SampleParser or the CTD parser flags BioChemUpload entries to be re-uploaded when data is reloaded for a datatype that has already been marked as uploaded.
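
As a sketch of that parser-driven flow (the helper name and status values are assumptions, not existing dart functions), the PlanktonParser/SampleParser/CTD parser could call something like this after reloading data for a sample type, with the andes/elog event parsers setting an analogous mission-level flag for BCS:

```python
from django.utils import timezone
from core import models as core_models

def flag_biochem_reupload(mission_sample_type):
    """Called by the sample parsers after a reload. If the datatype was
    already uploaded to BioChem, stamp it as modified and queue it for
    re-upload; if it was never selected for upload, do nothing."""
    upload = core_models.BioChemUpload.objects.filter(
        type=mission_sample_type).first()
    if upload and upload.upload_date:
        upload.modified_date = timezone.now()
        upload.status = 'upload'
        upload.save()
```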

upsonp commented 2 weeks ago

I don't think the BioChemUpload.type relationship is set up correctly. It's currently a many-to-many relationship, but each MissionSampleType can only have one update status.

When I was looking at updating the CTD parser to mark bottle data previously uploaded to BioChem as modified, I realized the relationship currently works like this:

for mission_sample_type in mission.mission_sample_types.all():
    if mission_sample_type.uploads.all().exists():
        ms_type = mission_sample_type.uploads.all().first()  # There's only ever one upload
        # do updates here.

When it should work like this:

for mission_sample_type in mission.mission_sample_types.all():
    if mission_sample_type.upload:
        ms_type = mission_sample_type.upload
        # do updates here.
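
A OneToOneField with related_name='upload' would give exactly that accessor. A minimal sketch of the model change (field names other than the relationship itself are assumptions):

```python
from django.db import models

class BioChemUpload(models.Model):
    # One upload-status record per sample type, replacing the current
    # many-to-many 'type' relationship.
    type = models.OneToOneField('core.MissionSampleType',
                                on_delete=models.CASCADE,
                                related_name='upload')
    status = models.CharField(max_length=20, blank=True, null=True)
    modified_date = models.DateTimeField(blank=True, null=True)
    upload_date = models.DateTimeField(blank=True, null=True)
```

One Django detail to watch: with a OneToOneField, accessing mission_sample_type.upload when no BioChemUpload row exists raises RelatedObjectDoesNotExist rather than returning None, so the check would need hasattr(mission_sample_type, 'upload') or a try/except instead of a plain truthiness test.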