vipyrsec / dragonfly-mainframe

The mainframe for Dragonfly
https://docs.vipyrsec.com/dragonfly-mainframe/
MIT License

Lock rows when selecting in get job CTE #234

Closed: Robin5605 closed this 1 month ago

Robin5605 commented 6 months ago

Currently, multiple concurrent requests to the get job endpoint can all return the same job: https://github.com/vipyrsec/dragonfly-mainframe/blob/c4749c2aa678339b831063bc56f7b34917a6bed4/src/mainframe/endpoints/job.py#L44-L58

To remedy this, we should lock the selected rows with a FOR UPDATE SKIP LOCKED clause. Rows already locked by one request's transaction are then skipped by concurrent requests instead of being selected again, so the same package is not returned twice.

We want something like this:

    cte = (
        select(Scan)
        .where(
            or_(
                Scan.status == Status.QUEUED,
                and_(
                    Scan.pending_at < datetime.now(timezone.utc) - timedelta(seconds=mainframe_settings.job_timeout),
                    Scan.status == Status.PENDING,
                ),
            )
        )
        .limit(batch)
        .options(joinedload(Scan.download_urls))
+       .with_for_update(skip_locked=True)
        .cte()
    )
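
For anyone unfamiliar with SKIP LOCKED, here is a rough in-memory analogy of the behaviour we are after. It is only a sketch: `Row`, `claim_batch` and the `pkg-N` names are hypothetical stand-ins, and a `threading.Lock` per row plays the part of the database row lock. Nothing here touches SQLAlchemy or PostgreSQL.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Row:
    """Hypothetical stand-in for a queued Scan row (not the real model)."""

    name: str
    lock: threading.Lock = field(default_factory=threading.Lock)


def claim_batch(rows, batch):
    """Claim up to `batch` rows, skipping rows another worker already holds.

    The non-blocking acquire mimics SKIP LOCKED: a worker either takes the
    row lock or moves on to the next row instead of waiting for it.
    """
    claimed = []
    for row in rows:
        if len(claimed) == batch:
            break
        if row.lock.acquire(blocking=False):
            claimed.append(row)
    return claimed


rows = [Row(f"pkg-{i}") for i in range(6)]
results = {}
barrier = threading.Barrier(3)  # release all workers at the same moment


def worker(worker_id):
    barrier.wait()
    # Locks are deliberately never released here, standing in for a
    # transaction that stays open while the job is handed out.
    results[worker_id] = [row.name for row in claim_batch(rows, batch=2)]


threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_claimed = [name for names in results.values() for name in names]
# No row is handed to more than one worker.
assert len(all_claimed) == len(set(all_claimed))
```

Without the non-blocking acquire (i.e. with a plain unlocked read), all three workers would walk the list from the start and pick the same first two rows, which is the duplicate-job behaviour described above.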