reacherhq / check-if-email-exists

Check if an email address exists without sending any email, written in Rust. Comes with a ⚙️ HTTP backend.
https://reacher.email

Bulk api doesn't seem to work #1411

namper closed this issue 7 months ago

namper commented 7 months ago

Email to check

No response

From where did you run check-if-email-exists?

Local, AWS EC2

Version of check-if-email-exists (if running it yourself)

0.9.1

What happened?

The bulk job stays in the Running state forever. The logs show that the task UUIDs are submitted successfully; however, /v0/bulk/7 still reports a status of Running after quite a long time.

{
    "job_id": 7,
    "created_at": "2023-12-18T18:35:00.624988Z",
    "finished_at": null,
    "total_records": 4,
    "total_processed": 0,
    "summary": {
        "total_safe": 0,
        "total_risky": 0,
        "total_invalid": 0,
        "total_unknown": 0
    },
    "job_status": "Running"
}
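
A minimal sketch of how a client might flag a job like this as stalled, assuming only the response fields shown above. The helper name `is_stalled` and the five-minute threshold are hypothetical and not part of Reacher's API.

```python
import json
from datetime import datetime, timedelta, timezone


def is_stalled(status: dict, now: datetime, max_age: timedelta) -> bool:
    """Hypothetical check: job still Running, nothing processed, older than max_age."""
    # Replace the trailing "Z" so fromisoformat also works on Python < 3.11.
    created = datetime.fromisoformat(status["created_at"].replace("Z", "+00:00"))
    return (
        status["job_status"] == "Running"
        and status["total_processed"] == 0
        and now - created > max_age
    )


# Body copied from the /v0/bulk/7 response above.
response_body = """
{
    "job_id": 7,
    "created_at": "2023-12-18T18:35:00.624988Z",
    "finished_at": null,
    "total_records": 4,
    "total_processed": 0,
    "summary": {
        "total_safe": 0,
        "total_risky": 0,
        "total_invalid": 0,
        "total_unknown": 0
    },
    "job_status": "Running"
}
"""

status = json.loads(response_body)
# Fixed "current time" for illustration: roughly ten minutes after submission.
now = datetime(2023, 12, 18, 18, 45, tzinfo=timezone.utc)
print(is_stalled(status, now, timedelta(minutes=5)))
```

The threshold is a judgment call: four records should normally finish quickly, but a real client would pick a limit that suits its batch sizes.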

Relevant log output

2023-12-18T18:35:00.632474Z DEBUG reacher: Submitted task to sqlxmq for [job=7] with [uuid=dad098a4-b001-4a15-a8d4-1ce59879728b]
2023-12-18T18:35:00.646007Z DEBUG reacher: Submitted task to sqlxmq for [job=7] with [uuid=28ae41f6-3a85-4297-9c54-601e7c1a752c]
2023-12-18T18:35:00.647670Z DEBUG reacher: Submitted task to sqlxmq for [job=7] with [uuid=aa8ad6f5-18c7-47d9-a121-c92c329d11ae]
2023-12-18T18:35:00.649214Z DEBUG reacher: Submitted task to sqlxmq for [job=7] with [uuid=9d91cd8e-6ef5-4733-bfa5-ccc50e586eee]
2023-12-18T18:35:00.649306Z  INFO reacher: 192.168.176.1:63028 "POST /v0/bulk HTTP/1.1" 200 "-" "insomnia/8.4.5" 25.928083ms
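
One way to cross-check these logs against the job status is to count the submitted tasks per job. The sketch below assumes the log line format shown above stays stable; the function name `submitted_tasks` is hypothetical.

```python
import re
from collections import defaultdict

# Matches the "Submitted task to sqlxmq" lines from the log excerpt above.
SUBMITTED = re.compile(
    r"Submitted task to sqlxmq for \[job=(\d+)\] with \[uuid=([0-9a-f-]{36})\]"
)


def submitted_tasks(log_text: str) -> dict:
    """Group submitted task UUIDs by job id, ignoring unrelated log lines."""
    tasks = defaultdict(list)
    for line in log_text.splitlines():
        match = SUBMITTED.search(line)
        if match:
            tasks[int(match.group(1))].append(match.group(2))
    return dict(tasks)


log_text = """
2023-12-18T18:35:00.632474Z DEBUG reacher: Submitted task to sqlxmq for [job=7] with [uuid=dad098a4-b001-4a15-a8d4-1ce59879728b]
2023-12-18T18:35:00.646007Z DEBUG reacher: Submitted task to sqlxmq for [job=7] with [uuid=28ae41f6-3a85-4297-9c54-601e7c1a752c]
2023-12-18T18:35:00.647670Z DEBUG reacher: Submitted task to sqlxmq for [job=7] with [uuid=aa8ad6f5-18c7-47d9-a121-c92c329d11ae]
2023-12-18T18:35:00.649214Z DEBUG reacher: Submitted task to sqlxmq for [job=7] with [uuid=9d91cd8e-6ef5-4733-bfa5-ccc50e586eee]
2023-12-18T18:35:00.649306Z  INFO reacher: 192.168.176.1:63028 "POST /v0/bulk HTTP/1.1" 200 "-" "insomnia/8.4.5" 25.928083ms
"""

tasks = submitted_tasks(log_text)
print(len(tasks[7]))
```

If the count matches total_records (4 here) while total_processed stays at 0, the tasks were queued but apparently never picked up by a worker.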

I have Postgres and RabbitMQ running, and data is saved to Postgres. Did I miss a setup step for this?

amaury1093 commented 7 months ago

Hey! There are currently 2 ways to do bulk verification:

  1. Use the /v0/bulk endpoints, as you did. This approach sends the jobs to the Postgres DB. Apparently it's buggy, as you reported.
  2. Use RabbitMQ. I've been testing this for 1 month. You don't need Postgres with this one.

I'm planning to remove support for option 1. It depends too heavily on https://github.com/Diggsey/sqlxmq, whereas RabbitMQ is battle-tested and performant (and has been running on Reacher in production for 1 month).

As such, I'll mark this issue as wont-fix, and work instead on adding docs for the RabbitMQ-based bulk.

For now, if you want to try RabbitMQ-based bulk, you'll need to mostly play around yourself:

Not super friendly for external contributors for now, but I'll work on adding docs in the next few weeks.

namper commented 7 months ago

Thanks for the response. It would be great to have a client-friendly bulk API in the future. I'm not an expert in Rust/warp, but I'm happy to contribute.