Closed eatyourgreens closed 4 years ago
I forgot https://talk.cyclonecenter.org/recent, which is the project that used Groups in Talk.
Chicago Wildlife Watch was the project that removed subjects for privacy reasons. It would be useful to know if those subjects are still hidden in the archived version. A number of subjects do come up as 404 errors during the build.
https://talk.condorwatch.org/recent :tada: Approved - Cam Reviewed 24 July 2020
Making each of these live is, I think, a case of renaming /index.html so it isn’t lost, then making a copy of /recent/index.html at /index.html. The static cache may need to be cleared too, to make the changes visible. Jenkins has a job that will do that.
yeah - good thinking Jim. I'll add this note here and we can add a script or docs on the readme if needed. E.g. to enable the static site version for https://talk.condorwatch.org/
DEPLOY_PATH="s3://zooniverse-static/talk.condorwatch.org/"
# preserve the original index file for rollback / posterity
#
# check the cmd looks right via --dryrun flag
aws s3 cp --dryrun "${DEPLOY_PATH}index.html" "${DEPLOY_PATH}old_index_`date '+%Y-%m-%d-%H:%M:%S'`.html"
# run the backup (no --dryrun)
aws s3 cp "${DEPLOY_PATH}index.html" "${DEPLOY_PATH}old_index_`date '+%Y-%m-%d-%H:%M:%S'`.html"
# enable the static version of the site - overwrite the old index file
aws s3 cp "${DEPLOY_PATH}recent/index.html" "${DEPLOY_PATH}index.html"
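If we do turn this into a script, a minimal sketch might look like the following. It only wraps the backup and overwrite commands above in a function that takes the hostname. The function name `enable_static_site` and the `AWS` and `STAMP` variables are placeholders I've made up for illustration; nothing like this exists in the repo yet.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the manual steps above for any archived Talk host.
# enable_static_site, AWS and STAMP are hypothetical names, not part of the repo.
set -eu

# Allow the aws binary to be swapped out (e.g. AWS=echo) to preview the commands.
AWS="${AWS:-aws}"

enable_static_site() {
  local host="$1"
  local deploy_path="s3://zooniverse-static/${host}/"
  # Compute the timestamp once so the backup name is stable within a run.
  local stamp="${STAMP:-$(date '+%Y-%m-%d-%H:%M:%S')}"

  # Preserve the original index file for rollback / posterity.
  "$AWS" s3 cp "${deploy_path}index.html" "${deploy_path}old_index_${stamp}.html"
  # Enable the static version of the site by overwriting the old index file.
  "$AWS" s3 cp "${deploy_path}recent/index.html" "${deploy_path}index.html"
}
```

Running `AWS=echo enable_static_site talk.condorwatch.org` would just print the two commands without touching S3, which is a cheap way to check them before running for real.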
Minor notes on Wormwatch Lab:
Besides that, all other content appears, so I'm happy to move this over to reviewed if this is not an issue.
That's a good catch. Old Talk has 318 tagged subjects and 3 tagged discussions. The new pages only have the tagged subjects.
EDIT: the tagged discussions all have the hashtag on the new pages, so I think that's fine. eg. https://talk.wormwatchlab.org/boards/BWS0000003/discussions/DWS00001hu/
Reposting here from the slack thread: Notes from Nature review notes — https://docs.google.com/document/d/1mYNeYzyGMz53BVBzMbYtNUaqxO85_Fy_d6xXKtSFGLA/edit?usp=sharing. Same questions as Shaun in his ‘of note’ section in Slack, plus a few minor additions (in red).
No missing / incorrect content. And based on Slack thread - none of these additional minor/cosmetic questions I've flagged are reason to rerun or change the process.
So, happy to have Notes from Nature moved over to reviewed.
NfN has broken images. From the collection that Laura linked to: https://talk.notesfromnature.org/collections/CNNL000006/
It's interesting - that happened to me at first, but then I refreshed the page, and all the images fully loaded.
Yeah, right click and open image in a new tab loads too eg.
Maybe the page is timing out, or our thumbnail service is timing out.
Clicking through to each subject shows an image too, so I'm not worried about missing images now.
Cyclone Center Talk generally passed the review checklist except it is missing the group discussions as noted in issue #67 and the search results for the tags are inconsistent. Here are a couple of examples:
eye-storm
pinhole-eye
I am wondering if this inconsistency is because of the missing groups?
Milky Way Project audit generally good 👍 .
a. Seen similar comments in other reviews, just confirming intentional:
b. Some tags have slight difference between old "Objects" and new "Subjects" count
c. Similar to discussion above on broken images - https://talk.milkywayproject.org/tags/interesting/subjects.html, but each subject (from handful tested) shows image so probably ok
Floating Forests seems to redirect to https://www.zooniverse.org/projects/zooniverse/floating-forests
Chicago Wildlife Watch looks good to me!
One small comment:
I can't remember if sites did this before this week's rebuild - but is it ok that there appear to be repeated pulls from the same discussions w/in the landing pages (e.g., https://talk.asteroidzoo.org/recent/ - 'not able to comment' board repeated several times, referencing a different post w/in that board). This matches with what's in https://talk.asteroidzoo.org/, so it seems to be what it needs to be; I just don't remember seeing this type of repeats before, so thought I'd flag. (Likely just me forgetting that this is how it is).
https://talk.floatingforests.org/recent/ is what caused me to notice this, since the 'Project still alive' board is referenced so many times in a row.
What should we do about https://talk.floatingforests.org/ not loading, which means we can't fully review Floating Forests? As noted in Slack, when we did the work to move FloatingForests to https://www.zooniverse.org/projects/zooniverse/floating-forests, it broke the old Talk.
Note: The https://talk.floatingforests.org/logs/build.log, https://talk.floatingforests.org/manifest/build.json, and https://talk.floatingforests.org/manifest/hosts.json tests (from the review doc) look good.
Random clicking around: https://talk.floatingforests.org/subjects/AKP000mrne/, https://talk.floatingforests.org/subjects/AKP00049ou/ - have broken image links, but others fine (e.g., https://talk.floatingforests.org/subjects/AKP000nlau/)
Not sure if it's the case for all, but if https://talk.floatingforests.org/subjects/AKP0000ccj/ has broken image link, it's also broken in https://talk.floatingforests.org/boards/BKP0000005/discussions/DKP000002k/ . Similarly https://talk.floatingforests.org/subjects/AKP0000ddn/ and https://talk.floatingforests.org/boards/BKP0000005/discussions/DKP000001u/, etc.
BTW, this is so cool: https://talk.floatingforests.org/collections/CKPS000046/ (the person got to classify an image that included their own research lab site).
AsteroidZoo review:
Flagging - https://talk.asteroidzoo.org/manifest/hosts.json points to {"asteroidzoo.s3.amazonaws.com":25818}
Unsure if matters: in https://talk.asteroidzoo.org/logs/build.log - many lines of caching that I hadn't seen in any of the other build logs and more 'Bad response' lines than in others. But the final 'Verifying JSON output' matches up to itself as expected, and matches https://talk.asteroidzoo.org/manifest/build.json. (Note: https://radiotalk.galaxyzoo.org/logs/build.log also has many lines of caching)
What will happen to amazon links (e.g., http://asteroidzoo.s3.amazonaws.com/CSS/703/2012/12Apr01/azoo/01_12APR01_N21022_0001-26-scaled.png) like in https://talk.asteroidzoo.org/boards/BAZ0000003/discussions/DAZ00007m2/? Understandable if those become broken links, just flagging.
Flagging that subjects don't show the 4 multi-images. E.g., https://talk.asteroidzoo.org/subjects/AAZ0000b53/ and https://talk.asteroidzoo.org/#/subjects/AAZ0000b53. Constraint to accept? Or error by mistake?
Otherwise, systematic search and random clicking around - looks good.
Plankton Portal review:
https://talk.planktonportal.org/logs/build.log, https://talk.planktonportal.org/manifest/build.json, https://talk.planktonportal.org/manifest/hosts.json tests (from the review doc) look good.
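For anyone repeating these three checks across projects, here's a rough sketch that just prints the URLs to open for a given Talk host. The three paths are the ones quoted in the reviews in this thread; the function name `review_urls` is made up for illustration.

```shell
# Sketch: list the build artefacts checked in each review for one Talk host.
# review_urls is a hypothetical helper; the three paths come from the reviews above.
review_urls() {
  local host="$1"
  local path
  for path in logs/build.log manifest/build.json manifest/hosts.json; do
    printf 'https://%s/%s\n' "$host" "$path"
  done
}

review_urls talk.planktonportal.org
```

Piping the output into something like `xargs -n 1 curl -sfI` should show whether each one responds, though I haven't run that against the live sites.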
In the random clicking around, found: Broken link to http://www.planktonportal.org/#/science/field-guide within https://talk.planktonportal.org/boards/BPK0000003/discussions/DPK00000de/ But I don't think we can do anything about that.
Otherwise, systematic search and random clicking around - looks good.
https://radiotalk.galaxyzoo.org/recent Review:
https://radiotalk.galaxyzoo.org/logs/build.log, https://radiotalk.galaxyzoo.org/manifest/build.json, https://radiotalk.galaxyzoo.org/manifest/hosts.json tests (from the review doc) look good.
Minor (I don't think merits a rerun, but still noting): 'Untitled discussion' under 'Science' in https://radiotalk.galaxyzoo.org/ means that there's no link to follow in the parallel spot in https://radiotalk.galaxyzoo.org/recent/.
Similar to wanting to double check about single image vs multi-images for Asteroid Zoo, is it purposeful/known that the former Radio GZoo could scroll through a number of images for a given subject, and the new Radio GZoo cannot? I can imagine yes, this is a known constraint, but want to check. E.g., https://radiotalk.galaxyzoo.org/subjects/ARG00011tc/ vs https://radiotalk.galaxyzoo.org/#/subjects/ARG00011tc
Otherwise, systematic search and random clicking around - looks good.
Galaxy Zoo Quench Review:
Minor: Why is https://quenchtalk.galaxyzoo.org/recent/ labeled 'Galaxy Zoo Starburst' and not 'Galaxy Zoo Quench'? Not reason to rerun, just noting.
Flagging (worth discussion): In a thread like the following: https://quenchtalk.galaxyzoo.org/boards/BGS0000001/discussions/DGS00001xy/ (and many others) there are a lot of links to other threads w/in the same project's Talk; e.g., a link to http://quenchtalk.galaxyzoo.org/#/boards/BGS000000a/discussions/DGS00001xk. Once the old Talk doesn't exist anymore, there will be many broken internal links?
Flagged: it seems for most tags, there are fewer results in the new Talk than in the old Talk; e.g., https://quenchtalk.galaxyzoo.org/#/search?tags[irregular]=true and https://quenchtalk.galaxyzoo.org/tags/irregular/, https://quenchtalk.galaxyzoo.org/tags/merger/ and https://quenchtalk.galaxyzoo.org/#/search?tags[merger]=true, https://quenchtalk.galaxyzoo.org/tags/agn/, https://quenchtalk.galaxyzoo.org/#/search?tags[agn]=true, etc.
Otherwise, systematic search and random clicking around - no other issues/flags.
> Minor: Why is https://quenchtalk.galaxyzoo.org/recent/ labeled 'Galaxy Zoo Starburst' and not 'Galaxy Zoo Quench'? Not reason to rerun, just noting.
Good catch. That's the project's name in Ouroboros.
Similarly, Floating Forests is called Kelp in Ouroboros and OWD is called War Diary.
> Flagged: it seems for most tags, there are fewer results in the new Talk than in the old Talk; e.g., https://quenchtalk.galaxyzoo.org/#/search?tags[irregular]=true and https://quenchtalk.galaxyzoo.org/tags/irregular/, https://quenchtalk.galaxyzoo.org/tags/merger/ and https://quenchtalk.galaxyzoo.org/#/search?tags[merger]=true, https://quenchtalk.galaxyzoo.org/tags/agn/, https://quenchtalk.galaxyzoo.org/#/search?tags[agn]=true, etc.
Randomly checking https://quenchtalk.galaxyzoo.org/tags/irregular/, I'm seeing the same numbers for old and new sites: 50 subjects and 1 collection.
> Flagging (worth discussion): In a thread like the following: https://quenchtalk.galaxyzoo.org/boards/BGS0000001/discussions/DGS00001xy/ (and many others) there are a lot of links to other threads w/in the same project's Talk; e.g., a link to http://quenchtalk.galaxyzoo.org/#/boards/BGS000000a/discussions/DGS00001xk. Once the old Talk doesn't exist anymore, there will be many broken internal links?
Those should still work eg. http://quenchtalk.galaxyzoo.org/recent#/boards/BGS000000a/discussions/DGS00001xk. We can check by making the project live (replacing the old index.html page with the new one).
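For spot-checking individual links in the meantime, the old and new URL pairs quoted in this thread suggest a simple mapping from an old hash-fragment URL to the archived page. The sketch below is only inferred from those pairs (drop the `#/`, add a trailing slash). It is not taken from the archiver's code, `old_to_archived` is a made-up name, and it won't cover the old `#/search?tags[...]` URLs.

```shell
# Sketch: derive the archived page URL from an old hash-fragment Talk URL.
# old_to_archived is a hypothetical helper; the mapping is inferred from the
# old/new URL pairs quoted in this thread, not taken from the archiver's code.
old_to_archived() {
  local url="$1"
  # Drop the "#/" that the old single-page app used for routing.
  url="$(printf '%s' "$url" | sed 's|/#/|/|')"
  # Archived pages are served from directory-style paths, so end with a slash.
  case "$url" in
    */) ;;
    *) url="${url}/" ;;
  esac
  printf '%s\n' "$url"
}

old_to_archived "https://talk.asteroidzoo.org/#/subjects/AAZ0000b53"
# prints https://talk.asteroidzoo.org/subjects/AAZ0000b53/
```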
> What should we do about https://talk.floatingforests.org/ not loading, which means we can't fully review Floating Forests? As noted in Slack, when we did the work to move FloatingForests to https://www.zooniverse.org/projects/zooniverse/floating-forests, it broke the old Talk.
Has anyone contacted us about the old site not being available? If not, this may mean that no one is using it. 😮
> > Flagged: it seems for most tags, there are fewer results in the new Talk than in the old Talk; e.g., https://quenchtalk.galaxyzoo.org/#/search?tags[irregular]=true and https://quenchtalk.galaxyzoo.org/tags/irregular/, https://quenchtalk.galaxyzoo.org/tags/merger/ and https://quenchtalk.galaxyzoo.org/#/search?tags[merger]=true, https://quenchtalk.galaxyzoo.org/tags/agn/, https://quenchtalk.galaxyzoo.org/#/search?tags[agn]=true, etc.
> Randomly checking https://quenchtalk.galaxyzoo.org/tags/irregular/, I'm seeing the same numbers for old and new sites: 50 subjects and 1 collection.
Strange, I see 50 (new) vs 51 (old).
And for this one, a broader split:
How odd. The first one is the same, 50 subjects + 1 collection = 51 search results. The second one has different numbers of subjects: 248 (new) vs. 250 (old.)
@camallen is it worth getting a second pair of eyes on the code that builds those tagged collections? https://github.com/zooniverse/Talk-archiver/blob/b5eb813082970ea249111ec44a3f28ed95160e79/src/helpers/tags.js#L42-L52
> How odd. The first one is the same, 50 subjects + 1 collection = 51 search results. The second one has different numbers of subjects: 248 (new) vs. 250 (old.)
> @camallen is it worth getting a second pair of eyes on the code that builds those tagged collections?
The code looks good to me. Re-reading the Ouroboros source, the tag search (and other searches) used an Elasticsearch (ES) service for results. The data in the main API had to be kept in sync with the ES system, and it was a wee bit notorious for failures etc.
While I can't say for sure why these discrepancies exist, I'd take the actual DB data export results (what Jim used to build the tag results) over the ES system results. Considering we're talking about a few tags here and there, I think a failure of data syncing between the Ouroboros API and ES is the issue here. https://github.com/zooniverse/Ouroboros/blob/5e040dd444d4c9302bee1c13fc5cf35651f2052e/lib/talk_search.rb#L64 https://github.com/zooniverse/Ouroboros/blob/5e040dd444d4c9302bee1c13fc5cf35651f2052e/lib/tasks/build_talk_search.rake
Yes, the Floating Forest researchers and their participants have had occasional uses in the past for that old Talk content (before we broke the link) and so it's good that https://talk.floatingforests.org/recent/ will exist.
https://github.com/zooniverse/Talk-archiver/issues/55#issuecomment-669798416
Summary of outstanding questions from above not yet addressed and/or resolved:
-- Noting that the old https://talk.floatingforests.org/ doesn't load so we can't do the comparison review. Reviewing https://talk.floatingforests.org/recent/ on its own looks good, except most subject images are broken; e.g., https://talk.floatingforests.org/subjects/AKP0000ccj/ and https://talk.floatingforests.org/subjects/AKP0000ddn/
-- Asteroid Zoo
1) Flagging https://talk.asteroidzoo.org/manifest/hosts.json points to {"asteroidzoo.s3.amazonaws.com":25818}.
2) Flagging that subjects don't show the 4 multi-images. E.g., https://talk.asteroidzoo.org/subjects/AAZ0000b53/ and https://talk.asteroidzoo.org/#/subjects/AAZ0000b53. Constraint to accept? Or error by mistake?
Same question for Radio Galaxy Zoo multi-images: E.g., https://radiotalk.galaxyzoo.org/subjects/ARG00011tc/ vs https://radiotalk.galaxyzoo.org/#/subjects/ARG00011tc
3) A question (worth response/clarity in this thread): What will happen to amazon links (e.g., http://asteroidzoo.s3.amazonaws.com/CSS/703/2012/12Apr01/azoo/01_12APR01_N21022_0001-26-scaled.png) like in https://talk.asteroidzoo.org/boards/BAZ0000003/discussions/DAZ00007m2/? Will those still be accessible? Or will they break? Understandable if those become broken links, but would be good to have a response here.
-- Galaxy Zoo Quench:
Flagging (worth response/clarity in this thread): In a thread like the following: https://quenchtalk.galaxyzoo.org/boards/BGS0000001/discussions/DGS00001xy/ (and many others) there are links to other threads w/in the same project's Talk; e.g., a link to http://quenchtalk.galaxyzoo.org/#/boards/BGS000000a/discussions/DGS00001xk. Once the old Talk doesn't exist anymore, will these be broken links? In a project like Quench, there are many internal references, so it'll be many broken links. Understandable if that needs to be the case, but would be good to have a response here.
-- Cyclone Center
https://github.com/zooniverse/Talk-archiver/issues/67 remains unresolved.
I've checked one of those Floating Forest subjects and Floating Forest images are broken because their URLs redirect to PFE. Here's an example. http://www.floatingforests.org/subjects/53fb88d669736d77dd66b400.jpg
Old links should still work. See the example in this comment. https://github.com/zooniverse/Talk-archiver/issues/55#issuecomment-669797122
EDIT: here's another example to test the old tag search URL fragments. https://quenchtalk.galaxyzoo.org/recent/#/search?tags[merger]=true Looks like those don't work with URL-encoding but do work if you use the unencoded URL.
https://quenchtalk.galaxyzoo.org/#/search?tags[merger]=true shows me an empty page (no search results) so I think URL-encoding breaks those direct search URLs on the old sites too.
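To make the difference concrete, this is roughly what the URL-encoded form of one of those search links looks like. It's a small sketch that only percent-encodes the square brackets; `encode_brackets` is a name invented for this example.

```shell
# Sketch: show the percent-encoded form of an old tag search URL.
# encode_brackets is a hypothetical helper that only encodes "[" and "]".
encode_brackets() {
  printf '%s\n' "$1" | sed -e 's/\[/%5B/g' -e 's/]/%5D/g'
}

encode_brackets 'https://quenchtalk.galaxyzoo.org/recent/#/search?tags[merger]=true'
# prints https://quenchtalk.galaxyzoo.org/recent/#/search?tags%5Bmerger%5D=true
```

The `%5B` / `%5D` version is the form that comes up empty; the version with literal `[` and `]` is the one that returns results.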
We made a decision, quite early on, that we wouldn't build custom subject viewers for each project. Instead, each subject has a link to the full subject as JSON, including metadata and all file locations. Here's an example from Disk Detective.
> Summary of outstanding questions from above not yet addressed and/or resolved:
> -- Noting that the old https://talk.floatingforests.org/ doesn't load so we can't do the comparison review. Reviewing https://talk.floatingforests.org/recent/ on its own looks good, except most subject images are broken; e.g., https://talk.floatingforests.org/subjects/AKP0000ccj/ and https://talk.floatingforests.org/subjects/AKP0000ddn/
The plan is to get a fix for old Talk so folks can review this site. The broken images are due to a misconfigured web server; these should be working once a fix is in. More details to come.
> -- Asteroid Zoo: Flagging https://talk.asteroidzoo.org/manifest/hosts.json points to {"asteroidzoo.s3.amazonaws.com":25818}.
These images are hosted by the project teams on S3. We have no control over them, so it's their responsibility to keep them online. Noting there are a few projects like this; Milky Way Project is another one.
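To see which hosts a given project depends on, something like the sketch below could read the manifest. It assumes hosts.json is a flat JSON object of host and number pairs, as in the Asteroid Zoo example quoted above; I'm not certain what the number means (possibly a count of references to that host), and `list_hosts` is a made-up helper name.

```shell
# Sketch: print each host and its number from a hosts.json manifest.
# list_hosts is a hypothetical helper. It assumes the manifest is a flat JSON
# object of "host": number pairs, like the Asteroid Zoo example quoted above.
list_hosts() {
  grep -o '"[^"]*": *[0-9][0-9]*' | tr -d '" ' | tr ':' ' '
}

printf '%s' '{"asteroidzoo.s3.amazonaws.com":25818}' | list_hosts
# prints asteroidzoo.s3.amazonaws.com 25818
```

Against a live site this would be something like `curl -s https://talk.asteroidzoo.org/manifest/hosts.json | list_hosts`.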
> Flagging that subjects don't show the 4 multi-images. E.g., https://talk.asteroidzoo.org/subjects/AAZ0000b53/ and https://talk.asteroidzoo.org/#/subjects/AAZ0000b53. Constraint to accept? Or error by mistake?
> Same question for Radio Galaxy Zoo multi-images: E.g., https://radiotalk.galaxyzoo.org/subjects/ARG00011tc/ vs https://radiotalk.galaxyzoo.org/#/subjects/ARG00011tc
This was an active choice, as outlined in https://github.com/zooniverse/Talk-archiver/issues/55#issuecomment-671820668. Each old Talk was customized to fit the data of the project, so we chose to create a generic image subject placeholder to keep the user content alive and provide exports of the original data via links / files.
> A question (worth response/clarity in this thread): What will happen to amazon links (e.g., http://asteroidzoo.s3.amazonaws.com/CSS/703/2012/12Apr01/azoo/01_12APR01_N21022_0001-26-scaled.png) like in https://talk.asteroidzoo.org/boards/BAZ0000003/discussions/DAZ00007m2/? Will those still be accessible? Or will they break? Understandable if those become broken links, but would be good to have a response here.
Any images hosted by us will be migrated and will keep working. Any owned / managed by external teams will keep working as long as they pay the hosting services (AWS S3 in this example) to keep running.
> -- Galaxy Zoo Quench: Flagging (worth response/clarity in this thread): In a thread like the following: https://quenchtalk.galaxyzoo.org/boards/BGS0000001/discussions/DGS00001xy/ (and many others) there are links to other threads w/in the same project's Talk; e.g., a link to http://quenchtalk.galaxyzoo.org/#/boards/BGS000000a/discussions/DGS00001xk. Once the old Talk doesn't exist anymore, will these be broken links? In a project like Quench, there are many internal references, so it'll be many broken links. Understandable if that needs to be the case, but would be good to have a response here.
As Jim points out, old links will still work via a redirect once we make the new pages go live.
> -- Cyclone Center
> #67 remains unresolved.
Chris is aware and we've actively made the decision to not do any more work archiving the content on this project. See https://github.com/zooniverse/Talk-archiver/issues/67#issuecomment-661841803
Ok - ouroboros talk floating forests is back online https://talk.floatingforests.org/
Just noting that I did the final review comparing https://talk.floatingforests.org/ and https://talk.floatingforests.org/recent/. Nothing new turned up. Still finding some broken images (e.g., https://talk.floatingforests.org/subjects/AKP0008ifd/), but that's already been noted above.
Have moved the project to approved.
this is all done - sites are live and working.
Approved
Using https://docs.google.com/document/d/1sfVy7O-dQK7vgWn10-f9oqNhnh2uKIyUzSe4lKizIEA/edit for reviews