the-blue-alliance / the-blue-alliance

A webapp for accessing information about the FIRST Robotics Competition.
https://www.thebluealliance.com
MIT License
394 stars 178 forks

Chief Delphi Migration Tracking Issue #2361

Open bdaroz opened 5 years ago

bdaroz commented 5 years ago

Chief Delphi is currently migrating from vBulletin to Discourse.

This does break all existing cdphotothread team media objects, but we did download and back the files up prior to the migration.

When the migration is complete we will need to:

phil-lopreiato commented 5 years ago

Hotfixed CD media to link to archive.org images in https://github.com/the-blue-alliance/the-blue-alliance/pull/2364 until we figure out a longer-term solution (I bet the apps are still broken though).

JonathanLindsey commented 4 years ago

Any progress on this @bdaroz?

jaredhasenklein commented 4 years ago

Looks like this is getting some new attention from this CD thread: https://www.chiefdelphi.com/t/posting-a-robot-image-on-chiefdelphi-and-linking-to-it-from-thebluealliance/388148

A few observations I noticed:

bdaroz commented 4 years ago

I believe there was a conversation in Slack at one point that concluded supporting "new" CD Media going forward was overly problematic. The reason: "new" CD Media are essentially regular posts with attachments (a post can have 0 or more attachments, 0 or more of which can be images), and those attachments can be in-line, attached, or both.

jaredhasenklein commented 4 years ago

What about a different approach? I think most users are capable of getting the direct image URL pretty easily these days (e.g. right click, copy image address). We could accept image files ending with .jpg, .png, and whatever else CD supports if the URL begins with chiefdelphi.com since we know those URLs are stable.
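A minimal sketch of that check, assuming Python. The function name, the allowed-host list, and the extension list are illustrative and not taken from the TBA codebase; only .jpg and .png are named above, so anything else CD supports would need to be added.

```python
from urllib.parse import urlparse

# Assumption: only the two extensions named in this thread; extend as needed.
ALLOWED_EXTENSIONS = (".jpg", ".png")
# Assumption: accept the bare domain and the www subdomain only.
ALLOWED_HOSTS = ("chiefdelphi.com", "www.chiefdelphi.com")


def is_cd_image_url(url: str) -> bool:
    """Return True if url looks like a direct image link hosted on Chief Delphi."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    # Compare the parsed hostname rather than the raw string prefix, so that
    # something like chiefdelphi.com.evil.example is rejected.
    host = (parsed.hostname or "").lower()
    if host not in ALLOWED_HOSTS:
        return False
    # Check the path only, so query strings do not affect the extension test.
    return parsed.path.lower().endswith(ALLOWED_EXTENSIONS)
```

Matching on the parsed hostname instead of a plain "begins with" test is a design choice here: it avoids accepting look-alike domains that merely start with the same text.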

ZachOrr commented 4 years ago

I didn’t think this was blocked? @bdaroz had the mappings between the old -> new CD URLs and this just fell through? I haven’t seen code for any attempts at this anywhere either. Maybe I missed the conversation where we decided it was too much work.

It’d be a shame to have collected years' worth of CD Media only to say it’s too much work to support after the forum migrated.

bdaroz commented 4 years ago

We had done the mappings, but supporting the new forum format for the multi-image posts was not anywhere near a simple regexp replacement.

bdaroz commented 2 years ago

Coming back around to this issue. Apologies if it got dropped.

The issue at the time with the CD vBulletin->Discourse migration was twofold: 1. the old links were dead, and 2. the new links made it very difficult to find the attachment image as opposed to a poster's avatar image.

While we did have a way to map old threads to new threads, the parsing tests done against the new threads left much to be desired, and this issue got back-burnered. (Admittedly way, way back-burnered.)

In revisiting this now, there appears to be a far more reliable way to find the intended "attached" image in the thread. This seems to work on both new and old threads, but only for the first attached image in the thread. The HTML returned for a thread now contains a meta property named og:image whose content is a direct URL to the first attachment.
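As a rough sketch of reading that og:image value, assuming Python and only the standard library: the class and function names are hypothetical, and fetching the thread HTML is left out. It accepts the tag whether og:image appears in a `property` or a `name` attribute, since I have not confirmed which one CD emits.

```python
from html.parser import HTMLParser
from typing import Optional


class OgImageParser(HTMLParser):
    """Collects the content of the first og:image meta tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.og_image: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only meta tags matter, and only the first match is kept.
        if tag != "meta" or self.og_image is not None:
            return
        attr_map = dict(attrs)
        key = attr_map.get("property") or attr_map.get("name")
        if key == "og:image":
            self.og_image = attr_map.get("content")


def extract_og_image(html: str) -> Optional[str]:
    """Return the og:image URL from thread HTML, or None if absent."""
    parser = OgImageParser()
    parser.feed(html)
    return parser.og_image
```

Because this reads only the meta tag, avatar images elsewhere in the page body are never considered.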

This doesn't solve multi-image threads, but perhaps there is another way. My current line of thinking is this, and it is what I would propose we implement soon(TM) to cover new CD media, using new media keys.

Once support is reestablished for /t/... URLs we can use that foundation to migrate the old media. I'm attaching the mapping file for that next step when we're ready.

media.csv

From my notes, the first column is the cdmphoto or cdmpaper ID that we are storing, and the second column identifies the new thread: the ID {ID} used in the form https://www.chiefdelphi.com/t/{ID}.
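A minimal sketch of loading that mapping, assuming Python. It also assumes the file has no header row and that the second column holds the bare thread ID rather than a full URL; neither is confirmed by my notes. The function name and the sample row are made up for illustration.

```python
import csv
import io
from typing import Dict, TextIO

CD_THREAD_URL = "https://www.chiefdelphi.com/t/{ID}"


def load_cd_mapping(fp: TextIO) -> Dict[str, str]:
    """Map each stored old CD media ID to its new thread URL."""
    mapping: Dict[str, str] = {}
    for row in csv.reader(fp):
        # Skip blank or short rows instead of failing on them.
        if len(row) < 2:
            continue
        old_id, new_id = row[0].strip(), row[1].strip()
        if not old_id or not new_id:
            continue
        mapping[old_id] = CD_THREAD_URL.format(ID=new_id)
    return mapping


# Hypothetical row, not real data from media.csv.
example = load_cd_mapping(io.StringIO("1001,2002\n"))
```

With the real file this would be called as `load_cd_mapping(open("media.csv", newline=""))`.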