mediacloud / backend

Media Cloud is an open source, open data platform that allows researchers to answer quantitative questions about the content of online media.
http://www.mediacloud.org
GNU Affero General Public License v3.0

add topics/cancel end point #552

Open hroberts opened 5 years ago

hroberts commented 5 years ago

Add a topics/cancel end point to end any currently running spider or snapshot jobs for a topic.

cindyloo commented 4 years ago

I think this is essentially done: new versions replace old versions, effectively cancelling them. We don't have an API endpoint, however.

hroberts commented 4 years ago

This is not really true. The idea of the cancel end point is that it would actually stop whatever topic spidering or snapshotting jobs are happening on the back end. Even if you create a new version, the system will continue trying to run the old version.
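To illustrate the distinction above, here is a minimal sketch of what actually stopping a running job could involve, as opposed to simply starting a new version. Everything in it is hypothetical: `run_spider`, the `cancel_requested` flag, and the idea that a topics/cancel handler would set that flag are assumptions for illustration, not the Media Cloud implementation.

```python
import threading


def run_spider(topics_id, urls, cancel_requested):
    """Hypothetical spider loop that checks a cancel flag before each unit of work."""
    processed = []
    for url in urls:
        if cancel_requested.is_set():
            # Stop early and report how far the job got.
            return "cancelled", processed
        processed.append(url)  # placeholder for the real fetch/parse work
    return "completed", processed


cancel_flag = threading.Event()

# Without a cancel request, the job runs to completion.
state, done = run_spider(3290, ["a", "b"], cancel_flag)

# A hypothetical topics/cancel handler would set the flag; the job then stops
# at its next check instead of continuing in the background.
cancel_flag.set()
state_after_cancel, done_after_cancel = run_spider(3290, ["a", "b"], cancel_flag)
```

The point of the sketch is that cancellation needs cooperation from the job itself: creating a new version does nothing to a loop that never checks for a stop signal.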


cindyloo commented 4 years ago

OK, I think we were addressing two concerns: 1) As far as the researchers are concerned, they wanted a way to, in essence, 'stop/cancel/start over' with a topic. Versioning does take care of that. 2) A way to stop the spidering/snapshotting jobs. What I didn't know is that with a new version, we are still running back end jobs from previous versions... Maybe that is what should be addressed, along with surfacing that to the admin user.

hroberts commented 4 years ago

I still don't think this is true. Spidering and snapshotting jobs both have locks that prevent more than one job from running at a time for a given topic. So if the front end creates a new version and starts the spider/snapshot job, then creates a second new version and starts a spider/snapshot job for that one, the second job will fail if the first one is still running.

-hal
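The locking behavior described in the comment above can be sketched roughly as follows. This is an in-memory illustration only: the `TopicJobLocks` class, its method names, and the choice to key the lock on job type plus topic id are assumptions, and the real back end may hold its locks quite differently. The error text mirrors the 'failed to get lock for snapshot for topic 3290' wording quoted elsewhere in this thread.

```python
import threading


class TopicLockError(RuntimeError):
    """Raised when a job cannot get the lock for its topic."""


class TopicJobLocks:
    """Hypothetical registry that allows one running job per (job type, topic)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._held = set()  # (job_type, topics_id) pairs with a running job

    def acquire(self, job_type, topics_id):
        with self._guard:
            key = (job_type, topics_id)
            if key in self._held:
                # Fail immediately instead of waiting for the running job.
                raise TopicLockError(
                    f"failed to get lock for {job_type} for topic {topics_id}"
                )
            self._held.add(key)

    def release(self, job_type, topics_id):
        with self._guard:
            self._held.discard((job_type, topics_id))


locks = TopicJobLocks()
locks.acquire("snapshot", 3290)  # job for the first new version starts

try:
    locks.acquire("snapshot", 3290)  # job for the second new version
    second_job_result = "started"
except TopicLockError as err:
    second_job_result = str(err)

locks.release("snapshot", 3290)  # first job finishes; later jobs can run again
```

Under these assumptions the second job fails fast while the first keeps running, which is why creating a new version does not cancel the old one.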


cindyloo commented 4 years ago

Gotcha. So if a version is completed, no jobs are running on the back end, correct? Creating a new version spiders/creates a new snapshot with modified info. Older versions should be described as ___ (not cancelled)?

If a version isn't completed (still running/spidering), the UI allows admins to create a new version but not regular users. If a new version is created while a previous one is running, the newer one will fail in all cases (for admins and users).

So surfacing a cancellation is still needed, for admins at least. None of this is particularly obvious to the front-end users, but that doesn't matter as long as they can use the topic version as explained.

hroberts commented 4 years ago

Yes, once a version is completed, nothing is running on the back end. Older versions should just be described as whatever their state is (queued, running, completed, etc.).

This should all be a pretty rare use case. The bad behavior when the lock gets hit is that the new version just fails with an error that says something like 'failed to get lock for snapshot for topic 3290', so I don't think we should worry about the edge case unless folks start hitting it. I just don't want you to build the UI to affirmatively give the user the notion that she can cancel a running job.

-hal
