As a pixl pipeline user, I don't want to wait 5 minutes while 30,000 images are checked to see whether they already exist and whether they have already been exported
[ ] Follow a similar approach to pixl export-patient-data: if the extract already exists, query once for all images in that extract (if the extract was newly created, we know we have to make a new image for each study). Then don't send messages for studies that have already been exported. It may be easier to keep both the requested studies and the existing images as dataframes before converting them into messages
[ ] Check whether there's a speed-up from adding all new Image entities at once, or whether SQLAlchemy already does what you'd expect and commits them in a single batch
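A rough sketch of the two tasks above, to make the intended shape concrete. It uses the standard library sqlite3 module as a stand-in for the project's database layer. The table, its columns, and the studies_to_send helper are all hypothetical, and the list of requested studies is assumed to contain no duplicates.

```python
import sqlite3

# Hypothetical schema: the real pixl tables and column names may differ.
SCHEMA = """
CREATE TABLE image (
    extract_id INTEGER NOT NULL,
    mrn TEXT NOT NULL,
    accession_number TEXT NOT NULL,
    exported_at TEXT,  -- NULL until the study has been exported
    PRIMARY KEY (extract_id, mrn, accession_number)
);
"""


def studies_to_send(conn, extract_id, requested, extract_is_new):
    """Return the requested (mrn, accession_number) pairs that still need a message.

    Assumes `requested` has no duplicate pairs.
    """
    if extract_is_new:
        # A brand-new extract cannot contain any images yet, so skip the lookup.
        known = {}
    else:
        # One query for the whole extract instead of one query per image.
        rows = conn.execute(
            "SELECT mrn, accession_number, exported_at FROM image WHERE extract_id = ?",
            (extract_id,),
        )
        known = {(mrn, accession): exported_at for mrn, accession, exported_at in rows}

    # Insert every missing image in one batch, followed by one commit.
    missing = [study for study in requested if study not in known]
    conn.executemany(
        "INSERT INTO image (extract_id, mrn, accession_number) VALUES (?, ?, ?)",
        [(extract_id, mrn, accession) for mrn, accession in missing],
    )
    conn.commit()

    # Only studies without an export timestamp need a message.
    return [study for study in requested if known.get(study) is None]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO image VALUES (1, 'mrn1', 'acc1', '2024-01-01')")  # exported
conn.execute("INSERT INTO image VALUES (1, 'mrn2', 'acc2', NULL)")  # not yet exported
conn.commit()

requested = [("mrn1", "acc1"), ("mrn2", "acc2"), ("mrn3", "acc3")]
to_send = studies_to_send(conn, 1, requested, extract_is_new=False)
image_count = conn.execute("SELECT COUNT(*) FROM image").fetchone()[0]
print(to_send)
print(image_count)
```

The same filtering could be done on dataframes instead of a dict, for example with a left merge using pandas' indicator option, if that turns out to be easier to convert into messages.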
Testing
I imagine the speed-up should be fairly obvious, but we could do some profiling with timeit
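One possible shape for that profiling, as a sketch only. It times a per-image lookup against a single query on an in-memory sqlite3 table. The table, column names, and both check functions are hypothetical stand-ins for the real pixl code. An in-memory database has no network round trips, so the gap against a real database server would likely be larger.

```python
import sqlite3
import timeit

N_IMAGES = 2_000  # kept small so the sketch runs quickly; the issue mentions 30,000

conn = sqlite3.connect(":memory:")
# Hypothetical table: the real pixl schema may differ.
conn.execute("CREATE TABLE image (accession_number TEXT PRIMARY KEY, exported INTEGER)")
conn.executemany(
    "INSERT INTO image VALUES (?, ?)",
    [(f"acc{i}", i % 2) for i in range(N_IMAGES)],
)
conn.commit()
accessions = [f"acc{i}" for i in range(N_IMAGES)]


def check_one_by_one():
    """Look up the export flag with one query per image."""
    exported = set()
    for accession in accessions:
        row = conn.execute(
            "SELECT exported FROM image WHERE accession_number = ?", (accession,)
        ).fetchone()
        if row is not None and row[0]:
            exported.add(accession)
    return exported


def check_in_one_query():
    """Fetch all exported images in a single query, then filter in memory."""
    rows = conn.execute("SELECT accession_number FROM image WHERE exported = 1")
    wanted = set(accessions)
    return {accession for (accession,) in rows if accession in wanted}


# Each function is run 5 times; timeit returns the total elapsed seconds.
per_image_seconds = timeit.timeit(check_one_by_one, number=5)
single_query_seconds = timeit.timeit(check_in_one_query, number=5)
print(f"per image:    {per_image_seconds:.4f}s")
print(f"single query: {single_query_seconds:.4f}s")
```

Both functions return the same set, so the timings compare like with like.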
Documentation
No response
Dependencies
No response
Details and Comments
No response