Open fevac opened 8 months ago
YES! YES! YES! There is only one drawback to this, and that is cram files. Assuming all files were present when we created the bundle, this should work. However, we would not know if cram/bam files were missing, which could perhaps be handled with an error and some kind of force flag. We could, for instance, ask the customer if they are okay with skipping those files. I am uncertain whether the deliver tag has always been used to tag things this way, but either way I think it should be from today onwards; it would improve backwards compatibility by miles.
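The error-plus-force-flag idea could look roughly like the sketch below. It assumes the file records are still in the bundle and only the files on disk have been cleaned away; `existing_files`, `MissingFilesError` and the `force` parameter are hypothetical names, not part of cg or Housekeeper.

```python
from pathlib import Path


class MissingFilesError(Exception):
    """Raised when files selected for delivery are no longer on disk."""


def existing_files(paths, force=False):
    """Return the paths that exist on disk.

    Hypothetical helper: without force, any missing file aborts the
    delivery; with force, missing files (e.g. cleaned cram) are skipped.
    """
    paths = [Path(path) for path in paths]
    missing = [path for path in paths if not path.exists()]
    if missing and not force:
        names = ", ".join(str(path) for path in missing)
        raise MissingFilesError(f"Missing files: {names}. Use force to skip them.")
    return [path for path in paths if path.exists()]
```

The default is to fail loudly, so skipping cram/bam files only happens after someone (for instance after checking with the customer) explicitly opts in.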
Great suggestion Eva!
We could also make sure that all files that need delivery get the delivery tag, so all cram files should have it in the future.
So, this is the systemd service that removes cram:
[Service]
Type=oneshot
ExecStart=/bin/bash -c "/home/proj/production/bin/miniconda3/envs/P_cg/bin/cg \
    --config /home/proj/production/servers/config/hasta.scilifelab.se/cg.yaml \
    clean \
    scout-finished-cases \
    -y \
    --days-old 300"
ExecStartPost=/bin/bash -c "systemctl --user start send-success-slack@%n.service"
Which does this:
I used the following query to check which pipelines use the deliver and delivery-report tags:
-- `order` and `case` are reserved words in MySQL, so they are backtick-quoted.
SELECT
    tag.name AS tag_name,
    `order`.workflow AS order_workflow,
    COUNT(DISTINCT file.id) AS file_count
FROM
    `housekeeper-stage`.bundle
    INNER JOIN `housekeeper-stage`.version ON bundle.id = version.bundle_id
    INNER JOIN `housekeeper-stage`.file ON version.id = file.version_id
    INNER JOIN `housekeeper-stage`.file_tag_link ON file.id = file_tag_link.file_id
    INNER JOIN `housekeeper-stage`.tag ON tag.id = file_tag_link.tag_id
    LEFT JOIN `cg-stage`.sample ON bundle.name = sample.internal_id
    LEFT JOIN `cg-stage`.`case` ON bundle.name = `case`.internal_id
    LEFT JOIN `cg-stage`.case_sample ON case_sample.case_id = `case`.id AND case_sample.sample_id = sample.id
    LEFT JOIN `cg-stage`.order_case ON order_case.case_id = `case`.id
    LEFT JOIN `cg-stage`.`order` ON `order`.id = order_case.order_id
    LEFT JOIN `cg-stage`.analysis ON analysis.case_id = `case`.id
WHERE
    tag.name IN ('deliver', 'delivery-report')
GROUP BY
    tag.name,
    `order`.workflow
ORDER BY
    tag_name DESC,
    file_count DESC;
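One way to sanity-check the counting logic is to run the same kind of aggregation against a toy in-memory SQLite database. The schema and rows below are made up and much simpler than the real housekeeper and cg schemas (`case_order` and `bundle_name` are hypothetical). The sketch only illustrates why `COUNT(DISTINCT file.id)` matters once a case joins to more than one order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file (id INTEGER PRIMARY KEY, bundle_name TEXT);
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE file_tag_link (file_id INTEGER, tag_id INTEGER);
CREATE TABLE case_order (case_name TEXT, order_id INTEGER, workflow TEXT);

INSERT INTO file VALUES (1, 'case_a'), (2, 'case_a'), (3, 'case_b');
INSERT INTO tag VALUES (1, 'deliver'), (2, 'delivery-report'), (3, 'cram');
INSERT INTO file_tag_link VALUES (1, 1), (2, 2), (3, 1), (3, 3);
-- case_a belongs to two orders, so every one of its files fans out to two rows.
INSERT INTO case_order VALUES
    ('case_a', 10, 'balsamic'), ('case_a', 11, 'balsamic'), ('case_b', 12, 'mutant');
""")

rows = conn.execute("""
SELECT tag.name, case_order.workflow,
       COUNT(DISTINCT file.id) AS file_count,
       COUNT(*) AS joined_rows
FROM file
JOIN file_tag_link ON file.id = file_tag_link.file_id
JOIN tag ON tag.id = file_tag_link.tag_id
LEFT JOIN case_order ON case_order.case_name = file.bundle_name
WHERE tag.name IN ('deliver', 'delivery-report')
GROUP BY tag.name, case_order.workflow
ORDER BY tag.name DESC, case_order.workflow
""").fetchall()

for row in rows:
    print(row)
```

For the balsamic rows, `joined_rows` is 2 while `file_count` stays at 1, which is the double counting the `DISTINCT` guards against.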
If this query is correct 😄, the chart shows that the balsamic and mutant workflows are the top users of these tags (based on the database on the stage server).
From a high-level look at the code, balsamic, rnafusion, and mip-rna appear to use the delivery-report tag in their configuration builders.
Description
Deliver files using the deliver tag in Housekeeper instead of the current cg deliver analysis behaviour. This doesn't need to be the default behaviour but an optional one instead.
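A minimal sketch of what tag-based selection could look like, using plain in-memory stand-ins rather than the real Housekeeper models. `TaggedFile`, `Version` and `files_for_delivery` are hypothetical names, and picking the latest version of the bundle is an assumption, not something the issue specifies:

```python
from dataclasses import dataclass, field
from datetime import date

DELIVERY_TAGS = frozenset({"deliver", "delivery-report"})


@dataclass
class TaggedFile:
    # Hypothetical stand-in for a Housekeeper file and its tags.
    path: str
    tags: frozenset = frozenset()


@dataclass
class Version:
    # Hypothetical stand-in for one version of a bundle.
    created_at: date
    files: list = field(default_factory=list)


def files_for_delivery(versions, tags=DELIVERY_TAGS):
    """Return paths in the latest version that carry any of the given tags."""
    if not versions:
        return []
    latest = max(versions, key=lambda version: version.created_at)
    return [file.path for file in latest.files if file.tags & tags]


# Toy data, made up for illustration.
old = Version(date(2023, 1, 1), [TaggedFile("old.vcf", frozenset({"deliver"}))])
new = Version(
    date(2024, 1, 1),
    [
        TaggedFile("case.vcf", frozenset({"deliver", "vcf"})),
        TaggedFile("report.html", frozenset({"delivery-report"})),
        TaggedFile("run.log", frozenset({"log"})),
    ],
)
print(files_for_delivery([old, new]))  # ['case.vcf', 'report.html']
```

Because selection depends only on the tags, it does not need per-workflow knowledge of file types, which is what would make it usable for older analyses as an opt-in mode.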
Suggested solution
Use the deliver and delivery-report tags in Housekeeper.
This can be closed when
Describe what needs to be done for this issue to be closed
Blocked by
If there are any blocking issues/prs/things in this or other repos. Please link to them.
Clarification
The cg deliver code works for delivering the files from the most recent analyses. For older analyses, the tags and files might be different, so the files may not be delivered properly.
Acceptance criteria