Open metaprime opened 6 years ago
Storing the URL in .ripme/rip.json could also be used to double-check ripped URLs against their output directory, which would help with #77 if we do have a directory naming bug in a ripper.
Wouldn't this be better handled with a custom script?
Here's the one I'm running to auto-archive a list of URLs and send a notification to a Discord channel:
#!/bin/bash
webhook="<redacted>"
before="$(find /mnt/unionfs/\!Rips -type f | wc -l)"
java -jar /opt/ripme.jar -f '/opt/ripme-urls' -l '/mnt/unionfs/!Rips/'
after="$(find /mnt/unionfs/\!Rips -type f | wc -l)"
difference=$((after - before))
urlnumber="$(wc -l < /opt/ripme-urls)"
redditnumber="$(grep -c "reddit" /opt/ripme-urls)"
instagramnumber="$(grep -c "instagram" /opt/ripme-urls)"
deviantartnumber="$(grep -c "deviantart" /opt/ripme-urls)"
facebooknumber="$(grep -c "facebook" /opt/ripme-urls)"
payload="$(cat <<EOF
{
"content": "Completed another round of archival.",
"tts": false,
"embeds": [
{
"title": "Ripped $urlnumber URLs",
"description": "A total of $difference new items have been added to the archive.",
"color": 44678,
"fields": [
{
"name": "DeviantArt",
"value": "$deviantartnumber accounts",
"inline": true
},
{
"name": "Facebook",
"value": "$facebooknumber pages",
"inline": true
},
{
"name": "Instagram",
"value": "$instagramnumber accounts",
"inline": true
},
{
"name": "Reddit",
"value": "$redditnumber subs",
"inline": true
}
]
}
]
}
EOF
)"
curl -X POST -H "Content-Type: application/json" -d "${payload}" "${webhook}"
Set up a cron job or a systemd timer to run it every X hours/days and you're set.
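As a rough illustration of the scheduling step, a crontab entry along these lines would run the script every six hours. The path /opt/ripme-notify.sh is a placeholder for wherever the script above is saved, and the six-hour interval is only an example.

```
# minute hour day-of-month month day-of-week  command
0 */6 * * * /opt/ripme-notify.sh
```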
@cyian-1756
For social media sites and sites with user-generated content or content aggregators, it makes sense to want to subscribe to URLs previously ripped and download updates periodically.
I envision that when you start a rip for a site like reddit, instagram, etc., you would get a pop-up asking if you want to subscribe to that URL. (I think the subscription should actually link to a folder so that your rip directory setting doesn't affect ability to re-rip, which requires adding support for re-ripping directories on disk: see https://github.com/4pr0n/ripme/issues/380)
The pop-up would also ask you to check a box if you never want to be prompted to subscribe.
This feature would also come with a UI tab for subscriptions so you can add them manually, no matter whether you have auto-subscribe enabled or disabled.
This feature would offer to re-rip subscriptions at launch (again, this can be disabled and done manually from the subscriptions tab), and maybe offer an opt-in to automatically re-rip on a daily basis while the app is running. (I don't want to create a background process for the app, to keep it light and portable. However, we could perhaps also add minimize-to-system-tray functionality, if that is possible, so that the window doesn't have to be kept open to get the subscription functionality.)
I think this would also work well with a per-output-directory file (and folder) to record some meta information and settings (rips/outputdir/.ripme/rip.json):
Newest-first sites like reddit.com/u/: The most recent URL ripped (the rip can stop when there's no more work to be done). Stopping automatically on sites like this should be on by default, but it should be a setting, so that the check can be temporarily disabled to help re-rip after intermittent failures downloading content.
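The early-stop idea for newest-first sites could look roughly like this. It is only a shell sketch of the logic, not how ripme works today; the function name new_urls_since and the assumption that the site's URLs arrive newest-first on standard input are both hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch: pass through URLs from a newest-first listing until
# the most recently ripped URL is reached, then stop.
# $1 = last URL ripped on the previous run (empty disables the check,
#      which corresponds to forcing a full re-rip).
new_urls_since() {
    local last_ripped="$1" url
    while IFS= read -r url; do
        # Everything from this point on was already handled by an earlier run.
        if [ -n "$last_ripped" ] && [ "$url" = "$last_ripped" ]; then
            break
        fi
        printf '%s\n' "$url"
    done
}
```

For example, `printf '%s\n' post3 post2 post1 | new_urls_since post2` would print only `post3`, while passing an empty string as the argument prints all three.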
.ripme/rip.json would include for example
.ripme could also include a file which is the per-directory URL history (.ripme/url_history.txt)
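A per-directory history file could be maintained along these lines. This is a minimal sketch, assuming one URL per line in .ripme/url_history.txt; the function name record_url and the one-URL-per-line format are assumptions, not something the proposal specifies.

```shell
#!/bin/bash
# Hypothetical sketch: append a URL to <dir>/.ripme/url_history.txt unless
# an identical line is already there.
# $1 = rip output directory, $2 = URL.
# Returns 0 if the URL was newly recorded, 1 if it was already present.
record_url() {
    local dir="$1" url="$2"
    local history="$dir/.ripme/url_history.txt"
    mkdir -p "$dir/.ripme"
    touch "$history"
    # -F: literal string, -x: match the whole line, -q: no output
    if grep -Fxq -- "$url" "$history"; then
        return 1
    fi
    printf '%s\n' "$url" >> "$history"
}
```

A caller could then use the return status to decide whether a URL still needs ripping, for example `record_url "$outdir" "$url" && echo "new: $url"`.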