OpenframeProject / Openframe-WebApp

A responsive front-end web app for Openframe.
GNU Affero General Public License v3.0

Clean up public stream #7

Open jvolker opened 4 years ago

jvolker commented 4 years ago

Only artworks that are online should be displayed. To encourage users to add preview images, artworks without a preview image should show up only after those that have one.
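
A minimal sketch of that ordering rule, to make the request concrete. The `StreamArtwork` shape and the field names `thumb_url` and `is_online` are assumptions for illustration, not confirmed parts of the Openframe data model.

```typescript
// Hypothetical artwork shape: `thumb_url` and `is_online` are assumed field
// names for this sketch, not confirmed parts of the Openframe data model.
interface StreamArtwork {
  id: string;
  thumb_url?: string; // preview image URL, if the author provided one
  is_online: boolean; // result of some earlier availability check
}

// Keep only online artworks and list those with a preview image first.
// Array.prototype.sort is stable (ES2019+), so the existing order is kept
// within each of the two groups.
function orderStream(artworks: StreamArtwork[]): StreamArtwork[] {
  return artworks
    .filter((a) => a.is_online) // filter() copies, so the input is not mutated
    .sort((a, b) => Number(!a.thumb_url) - Number(!b.thumb_url));
}
```

Sorting on a single boolean keeps the stream's existing order (for example, newest first) inside each group, so only the grouping changes.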

This topic has been discussed here: https://groups.google.com/forum/#!topic/openframeio/4OeIVWsowHA

@jmwohl:

We could probably run a scheduled task that would look for missing artwork files and either mark or remove them from the stream and notify their author. Not too complicated; if anyone wants to work on something like that, it would happen in the Openframe-APIServer repo. It could probably be an admin-authorized endpoint that gets hit via cron?

jvolker commented 4 years ago

I've written a script that checks and logs the availability of the artwork and thumbnail URLs. It also gives some statistics. It could easily be extended to push the results back to the API once the API is prepared for this: https://github.com/jvolker/Openframe-ArtworkAvailabilityChecker
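
For illustration, the core of such a check could look roughly like the sketch below. It is not the code from the linked repository: `Fetcher`, `hasHttpScheme`, `describeStatus`, and `checkUrl` are hypothetical names, and the validity test is deliberately simplified to "has an http(s) scheme".

```typescript
interface CheckResult {
  ok: boolean;
  reason: string;
}

// Minimal response shape the check needs; `Fetcher` is a hypothetical
// abstraction so that any HTTP client can be plugged in.
type Fetcher = (url: string) => Promise<{ status: number }>;

// Simplified validity test: only accepts absolute http(s) URLs.
function hasHttpScheme(raw: string): boolean {
  return /^https?:\/\/\S+$/i.test(raw.trim());
}

// Map an HTTP status code to a short, log-friendly label.
function describeStatus(status: number): CheckResult {
  if (status >= 200 && status < 300) return { ok: true, reason: "OK" };
  if (status === 403) return { ok: false, reason: "403 Forbidden" };
  if (status === 404) return { ok: false, reason: "404 Not found" };
  return { ok: false, reason: `HTTP ${status}` };
}

async function checkUrl(raw: string, fetcher: Fetcher): Promise<CheckResult> {
  if (!hasHttpScheme(raw)) return { ok: false, reason: "Invalid URL" };
  try {
    const response = await fetcher(raw.trim());
    return describeStatus(response.status);
  } catch (err) {
    // Network-level failures such as DNS errors end up here.
    return { ok: false, reason: `Request failed: ${String(err)}` };
  }
}
```

With the global `fetch` available in Node 18+, a call might look like `checkUrl(url, (u) => fetch(u, { method: "HEAD" }))`. Some servers reject `HEAD` requests, so falling back to `GET` may be necessary.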

The result

It's not as bad as I thought: only 31 artworks are unavailable, although an available artwork could still fail for other reasons (like a wrong file type) while loading. What I find really interesting is that 65% of all available artworks are shaders. We've also got some duplicate artwork URLs.

Found 324 public artworks.

57841827c0006da8310e8e69 Invalid URL meetar.github.io/terrain-tour
56e4dc3f44973147579abbef ENOTFOUND server can not be found 
57041cdee1f87ce61cc0af07 ENOTFOUND server can not be found 
572114b0c2cb33000be1eea2 ENOTFOUND server can not be found 
570c0c7c708dc5311ca1e4c8 404 Not found 
570f716e507bfb8922c89814 403 Forbidden 
56e0c15e17bbab454407c2b7 404 Not found 
57a85d5bc0006da8310e906b 404 Not found 
576ae072c0006da8310e8b89 404 Not found 
57640ff1c0006da8310e8a94 404 Not found 
57640f99c0006da8310e8a90 404 Not found 
59711251b2462d7382b8f530 404 Not found 
5970fe40b2462d7382b8f52f 404 Not found 
597219fbb2462d7382b8f53f 404 Not found 
5972196db2462d7382b8f53e 404 Not found 
59711293b2462d7382b8f534 404 Not found 
588d38697cb7f28d67893075 404 Not found 
5a7876c3b2462d7382b8fa63 403 Forbidden 
576994d0c0006da8310e8b6c 404 Not found 
5af67553a38167076035b505 403 Forbidden 
5cd6da5aa38167076035bc00 403 Forbidden 
5af67510a38167076035b504 403 Forbidden 
5af67576a38167076035b506 403 Forbidden 
56e5c923a6b560d606184662 404 Not found 
57846752c0006da8310e8e89 404 Not found 
5b3d1c31a38167076035b5d6 403 Forbidden 
584d65957cb7f28d67892d8e 403 Forbidden 
5896d4c6a9c1b11803b240fb 404 Not found 
59b615adb2462d7382b8f5de 404 Not found 
59b615d1b2462d7382b8f5df 404 Not found 
59b61573b2462d7382b8f5dd 404 Not found 

293/324 artworks available
31/324 artworks unavailable
283/324 thumbnails available

Artwork type counts:
openframe-glslviewer: 189
openframe-image: 82
openframe-website: 13
openframe-video: 5
openframe-of: 2
openframe-processing: 2

Duplicate artwork URLs:
https://thebookofshaders.com/log/160306213426.frag: 2
https://goo.gl/images/Y1weHm: 2
https://goo.gl/images/j9YfL2: 2
https://thebookofshaders.com/log/160414134236.frag: 2
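
Tallies like the type counts and duplicate URLs above can be produced with a small counting helper. This is a sketch under assumptions, not the script's actual code: `countBy` and `onlyDuplicates` are hypothetical names, and the artwork fields are assumed to be called `format` and `url`.

```typescript
// Count how often each key occurs, e.g. artwork types or artwork URLs.
function countBy<T>(items: T[], keyOf: (item: T) => string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const item of items) {
    const key = keyOf(item);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Keep only the keys that occur more than once.
function onlyDuplicates(counts: Map<string, number>): Map<string, number> {
  return new Map(Array.from(counts.entries()).filter(([, n]) => n > 1));
}
```

Assuming those field names, `countBy(artworks, (a) => a.format)` would give the per-type counts and `onlyDuplicates(countBy(artworks, (a) => a.url))` the duplicate URLs.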

jmwohl commented 4 years ago

Nice, this is great! Yeah, since shaders were one of our primary use cases and we had a good collaboration with Patricio from The Book of Shaders (including importing directly to openframe from BoS), it's not surprising to me that such a large proportion are shaders.

It should be a relatively easy next step to remove unavailable artworks from the stream: I think we should be able to just use the JS client to update the unavailable artworks, setting is_public to false. This way no artworks are deleted unexpectedly from a user's account, but they won't show up in the stream. It would have to be run by a super user that can modify all artworks, which I can do. Does that make sense?

jvolker commented 4 years ago

> including importing directly to openframe from BoS), it's not surprising to me that such a large proportion are shaders.

That's what I thought. Shows what potential a web clipper could have.

> setting is_public to false. This way no artworks are deleted unexpectedly from a user's account, but they won't show up in the stream. It would have to be run by a super user that can modify all artworks, which I can do.

I was thinking of doing it more thoroughly, but it might be over the top:

Is there some sort of email template available in the web app or server that could be reused, or should this script have a separate one?

jvolker commented 4 years ago

@jmwohl Thanks for fixing the login issue in the JS client so quickly.

I've updated the script a little more. Uncommenting line 92, updateDatabase(artwork, false), should update the artwork in the database and set is_public to false. Be aware, though: it's untested so far, since I don't have a test database or superuser rights. Let me know if you'd rather go the more thorough way I've described above.

jmwohl commented 4 years ago

I'm finally finding a bit of time to get back to this. Running with a superuser account didn't work quite as I'd hoped, and rather than mess around with it, I used the output list of unavailable artwork IDs to run an update against Mongo directly, making these artworks private and thus removing them from the stream.
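
A rough sketch of what such a bulk update could look like, for anyone repeating it. This is not the exact command that was run: `buildMakePrivateUpdate` is a hypothetical helper, and the collection name and ID type are not stated in this thread, so they are left to the caller.

```typescript
// Build the filter and update documents for a MongoDB `updateMany` call that
// marks the given artworks as private. `toId` converts a hex string into
// whatever ID type the database uses (for example the driver's ObjectId).
function buildMakePrivateUpdate<Id>(hexIds: string[], toId: (hex: string) => Id) {
  for (const hex of hexIds) {
    // Guard against typos when the ID list is copied from a log.
    if (!/^[0-9a-f]{24}$/i.test(hex)) {
      throw new Error(`Not a 24-character hex ID: ${hex}`);
    }
  }
  return {
    filter: { _id: { $in: hexIds.map(toId) } },
    update: { $set: { is_public: false } },
  };
}
```

With the official Node.js driver, the result could be passed along as `collection.updateMany(filter, update)`, using `(hex) => new ObjectId(hex)` as `toId`. Setting only `is_public` leaves the artworks in their owners' accounts.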

There are still a number of artworks with missing preview images, but I'm not sure we should automatically remove those.

As you've suggested, we could automate this process so that the stream remains clean. I'd like to get a local server env running to test things on but haven't had time to set that up yet. I'd really like to go through and update dependencies as part of that, but that might be too ambitious.