Description of changes

Added a filtering step to SequencingObjectProcessingService so that files belonging to a sequencing run that is not in the COMPLETE state are not picked up for processing.
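The eligibility rule can be modelled in a few lines. This is only an illustrative Python sketch, not the Java code in SequencingObjectProcessingService: the names `UploadStatus`, `SequencingRun`, `SequencingObject` and `ready_for_processing` are hypothetical, and it assumes that a file with no sequencing run (such as a Web GUI upload) stays eligible, as the test steps below expect.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class UploadStatus(Enum):
    # Only the two states mentioned in this description are modelled here.
    UPLOADING = "UPLOADING"
    COMPLETE = "COMPLETE"


@dataclass
class SequencingRun:
    run_id: int
    status: UploadStatus


@dataclass
class SequencingObject:
    object_id: int
    # None models a file uploaded without a sequencing run (e.g. via the Web GUI).
    run: Optional[SequencingRun] = None


def ready_for_processing(objects: List[SequencingObject]) -> List[SequencingObject]:
    """Keep objects that have no run, or whose run is COMPLETE."""
    return [
        obj for obj in objects
        if obj.run is None or obj.run.status is UploadStatus.COMPLETE
    ]


candidates = [
    SequencingObject(1, SequencingRun(10, UploadStatus.UPLOADING)),  # skipped
    SequencingObject(2, SequencingRun(11, UploadStatus.COMPLETE)),   # kept
    SequencingObject(3),                                             # kept, no run
]
print([obj.object_id for obj in ready_for_processing(candidates)])  # [2, 3]
```

The object on the UPLOADING run is not dropped permanently; it is simply skipped until a later pass finds its run COMPLETE.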

Related issue

Fixes #1505

Fixes the race condition in which FastQC runs on files before they are fully uploaded.

How to test changes

1. Run IRIDA.
2. In IRIDA, create a new project and a new sample in that project. Make a note of the project ID and the sample name.

3. Check out and build the irida-uploader codebase, then open a python3 interpreter to import it.

cd irida-uploader
git pull origin development
make
source .virtualenv/bin/activate
python3

4. Use the libraries to make a sequencing run and upload a file.


import iridauploader
from iridauploader import api
# make an api instance of IRIDA
# if you built irida with dev db seed, the following creds should work
a = api.ApiCalls("sequencer", "N9Ywc6GKWWZotzsJGutj3BZXJDRn65fXJqjrk29yTjI", "http://localhost:8080/api/", "admin", "Password1!")
# test connection
a.get_irida_version()

Make a sequence file object.

A valid file that will pass FastQC can be found in the irida-uploader source: irida-uploader/examples/directory_run/file_1.fastq.gz

sf = iridauploader.model.SequenceFile(['/path/to/a/fastq.gz/file/mysample.fastq.gz'])

Create a new sequencing run in IRIDA.

run_id = a.create_seq_run(metadata={'layoutType': 'SINGLE_END'}, sequencing_run_type='miseq')

Go to the IRIDA admin panel to view your sequencing run: http://localhost:8080/admin/sequencing-runs

Use your project id and sample name from before

p_id = 1  # Note: this should be an int
s_name = 'my_sample'  # Note: this should be a string

Upload the data.

a.send_sequence_files(sf, s_name, p_id, run_id)

The response should look something like this:

{'resource': {'file': '/tmp/irida/sequence-files/45/1/valid.fastq.gz', 'createdDate': 1702507892000, 'modifiedDate': 1702507892000, 'uploadSha256': None, 'fileName': 'valid.fastq.gz', 'label': 'valid.fastq.gz', 'fileSizeBytes': 864, 'links': [{'rel': 'sample/sequenceFiles', 'href': 'http://localhost:8080/api/samples/140/sequenceFiles'}, {'rel': 'self', 'href': 'http://localhost:8080/api/samples/140/unpaired/24/files/45'}, {'rel': 'sample', 'href': 'http://localhost:8080/api/samples/140'}, {'rel': 'sequenceFile/sequencingObject', 'href': 'http://localhost:8080/api/samples/140/unpaired/24'}], 'identifier': '45'}}
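To check the upload from the interpreter rather than by eye, the identifier and links can be pulled out of the returned dict. A minimal sketch, assuming the response has the shape shown above; `link_for` is a hypothetical helper, not part of iridauploader, and `response` here is a trimmed copy of the example.

```python
from typing import Optional


def link_for(response: dict, rel: str) -> Optional[str]:
    """Return the href of the first link whose 'rel' matches, or None."""
    for link in response["resource"]["links"]:
        if link["rel"] == rel:
            return link["href"]
    return None


# Trimmed copy of the example response; in the session above, use the
# value returned by a.send_sequence_files(...) instead.
response = {
    "resource": {
        "fileName": "valid.fastq.gz",
        "fileSizeBytes": 864,
        "identifier": "45",
        "links": [
            {"rel": "self",
             "href": "http://localhost:8080/api/samples/140/unpaired/24/files/45"},
            {"rel": "sample",
             "href": "http://localhost:8080/api/samples/140"},
        ],
    }
}

file_id = response["resource"]["identifier"]  # note: a string, not an int
print(file_id, link_for(response, "self"))
```

A non-zero `fileSizeBytes` only shows that IRIDA received the file; it says nothing about whether FastQC has run.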

5. On IRIDA, see that the sequencing run is still in UPLOADING state
6. On IRIDA, wait a few minutes and see that the file has not been processed by FastQC
7. Via the python interpreter, set the sequencing run to COMPLETE
```python
a.set_seq_run_complete(run_id)
```

8. On IRIDA, see that the sequencing run is in COMPLETE state
9. On IRIDA, assuming your dev environment is set up correctly, see that FastQC has run on the sample (this is also visible in the dev output logs)
10. Upload a file to a sample via the Web GUI and see that FastQC runs on it, because it has no associated sequencing run

Checklist

Things for the developer to confirm they've done before the PR should be accepted:

[x] CHANGELOG.md (and UPGRADING.md if necessary) updated with information for new change.
[x] Tests added (or description of how to test) for any new features.
[ ] User documentation updated for UI or technical changes.

phac-nml / irida

Fix file processor running on files that are on unfinished upload runs. #1506