falbalus opened 7 years ago
I think it's superfluous.
Removing the blog index and re-adding the blog does exactly the same thing and doesn't need any additional code. All files still in the download location will be re-added to the blog index when the downloader starts to download them, because the downloader checks the file sizes before downloading in order to resume any canceled downloads. If the sizes are equal, it doesn't download anything but adds the files to the index as finished.
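The size check described above can be sketched roughly as follows. This is a hypothetical simplification in Python (TumblThree itself is written in C#); names like `plan_download` and the `index` set are illustrative, not the project's actual code:

```python
import os

def plan_download(url, local_path, remote_size, index):
    """Decide what to do with a file before downloading it.

    remote_size is the size the server reports for the file;
    index is the blog's set of finished file names. All names
    here are illustrative, not TumblThree's actual code.
    """
    if os.path.exists(local_path):
        local_size = os.path.getsize(local_path)
        if local_size >= remote_size:
            # Size matches (or exceeds): nothing to download,
            # but record the file in the index as finished.
            index.add(os.path.basename(local_path))
            return "skip"
        # Partial file from a canceled download: resume it.
        return ("resume", local_size)
    return "download"
```

This is why re-adding a blog is cheap: files already on disk fall into the "skip" branch and simply get registered as finished again.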
Almost the same thing would happen if I added rescan logic.
Ok, I'll add it eventually. Comparing the directory with the content of the index file shouldn't be too much code. But for now, I think you can use the remove/re-add workaround. It only works for binary files though (pictures, videos, audio).
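The directory-versus-index comparison mentioned above is indeed small; a minimal Python sketch, assuming the index is just a collection of downloaded file names (the function name and data shapes are illustrative):

```python
import os

def find_missing(index_entries, download_dir):
    """Return index entries whose files are no longer on disk.

    index_entries: file names recorded as downloaded in the blog's
    index file; download_dir: the blog's download folder.
    """
    on_disk = set(os.listdir(download_dir))
    return sorted(set(index_entries) - on_disk)
```

Anything this returns could then be removed from the index so the next crawl re-downloads it.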
... because the workaround never worked, a built-in option would be nice.
Sure it works. What exactly doesn't work for you?
I tried it several times.
a) Open TumblThree
b) Add a blog
c) Download the blog completely
d) Close TumblThree
e) Delete some files physically
f) Open TumblThree
g) Download with the Re-Scan
Deleted files will not be downloaded again (well, at my place).
a) Open TumblThree
b) Add a blog
c) Download the blog completely
d) Close TumblThree
e) Delete some files physically
f) Open TumblThree
g) Delete the blog
h) Re-add the blog
Deleted files will not be downloaded again (well, at my place).
This is my experience ever since, and that is why I asked for this re-download feature ;-)
> a) Open TumblThree b) Add a blog c) Download the blog completely. d) Close TumblThree e) Delete some files physically f) Open TumblThree g) Delete the blog h) Re-Add the blog
> Deleted files will not be downloaded again (well at my place).
Except that they will be downloaded again. I've just tested it: I downloaded 67 images from http://mywallpapercollections.tumblr.com/, removed 20 images, deleted the blog (index only) from within TumblThree, re-added it, crawled it again, and all 67 images are back.
It wouldn't make sense otherwise, since all information from the first download is gone once you delete the index files.
Would it be possible to include a file hash or file size comparison? I've noticed a lot of 0 KB jpg/mp4 files in the blog downloads, and some videos are corrupted (only half the duration was downloaded). I would love that feature so I can rescan the blog and fix the corrupted/missing files.
Thanks
I'm probably not adding anything anytime soon.
Lower your parallel connections to <= 8 (?). The tumblr servers drop all open connections every xx seconds if there are too many parallel open connections. I don't know why, but I've noticed that all connections get closed at the same time, regardless of when they were opened.
Say, if you download 20 videos in parallel, most of them are probably not finished at that time. TumblThree re-opens each forcefully closed connection a few times, as specified in "MaxNumberOfRetries" in the settings.json, but I think it's better not to let that happen in the first place.
I haven't played around enough to figure out the exact number of possible connections, but if you only download videos, I personally would set it to 8, since I remember that number working without the issue. I've written more about this in this issue. You can check it for yourself: open the code in Visual Studio, download several videos in parallel, and you'll see that all connections get forcefully closed at the same time.
I wasn't sure if I should change the default settings though, since most people probably download images, which seem to be okay (since they are smaller).
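The effect of such a parallel-connection limit can be illustrated with a small sketch. This is a generic Python example of capping concurrent downloads with a semaphore, not TumblThree's actual (C#) implementation; `fetch` stands in for whatever actually performs the transfer:

```python
import threading

MAX_PARALLEL = 8  # the limit suggested above for video downloads

connection_slots = threading.BoundedSemaphore(MAX_PARALLEL)

def download_with_limit(url, fetch):
    # Block until one of the MAX_PARALLEL slots is free, so the
    # server never sees more than MAX_PARALLEL open connections
    # from us at once.
    with connection_slots:
        return fetch(url)
```

Every download goes through the same semaphore, so even if 20 videos are queued, only 8 connections are ever open simultaneously.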
@TehBotolSosro: What version are you using? Do you have connection issues in general? I basically never get 0 KB images. What are your parallel connection settings set to?
v1.0.8.18, 16 connections and 2 parallel.
I also tried deleting the corrupted videos and the 0 KB files, deleting the index, removing the blog, adding it again, and crawling it again.
It downloaded the video again, but it was still corrupted. When I downloaded it directly using IDM, the video was not corrupted (50 MB mp4 video).
I tried two times already; I will try again with lower connections.
Usually, in 9k files (1 blog) there are 20-40 corrupted videos and a dozen 0 KB files. Also, why do the connection settings get changed back to default when I reopen the program, even though I already saved them?
Okay. If it works for you with a lower number of connections, I'll update the default settings and add a tooltip to the connections slider with some explanatory text.
> also why does the connection setting get changed to default when reopen the program even though i already save them
That doesn't happen here. Do you actually press the save button, or do you just close the settings window with the close button?
Okay, lowering the parallel connections to 4 and increasing the timeout worked fine; the videos downloaded without any corruption.
I did press the save button, but I will try more later, since it's still downloading and I don't want to mess it up.
Two questions: if I stop the download while it's still downloading a file and later crawl again, will the file get resumed or not?
Also, do I need to delete the corrupted videos before I delete the blog and re-add it? I mean, will it check the hash/size? Otherwise I need to check them manually before I can delete them.
Edit: I changed the connections to 4, increased the timeout, clicked save, exited, and reopened, and the settings were changed back to 16 connections and a 120 timeout again. Is it because when I update TumblThree I usually overwrite the old files? The token and download folder settings are still there when I reopen the program (not reverted to default).
> if i stopped the download when it still downloading a file and later crawl them again, will the file will get resumed or not? also do i need to delete the corrupted video before i delete the blog and readd them again, i meant will it check the hash/size? since i need to manually check them before i can delete them.
If the file size is smaller than it should be, it will be resumed. Thus, you shouldn't have to delete anything, and re-adding the blog to the queue should finish all unfinished files. Files that had too many retries, like your video files, shouldn't have been added to the blog's database, and the next crawl will continue with them.
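"Resumed" here means finishing the partial file rather than starting over, which is usually done with an HTTP range request. A hedged Python sketch of the idea follows (TumblThree itself is C# and may implement this differently; `build_resume_request` and the URL are illustrative):

```python
import os
import urllib.request

def build_resume_request(url, offset):
    """Build a request that asks only for the bytes we don't have yet."""
    req = urllib.request.Request(url)
    if offset:
        req.add_header("Range", "bytes=%d-" % offset)
    return req

def resume_download(url, local_path):
    """Append the missing tail of a partially downloaded file."""
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    req = build_resume_request(url, offset)
    with urllib.request.urlopen(req) as resp, open(local_path, "ab") as fh:
        fh.write(resp.read())
```

If the server honors the `Range` header, it replies with only the remaining bytes, which get appended to the existing file.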
> changing the connection to 4 and increase the timeout click save, exit and reopen again, and the setting get changed to 16 and timeout 120 again, is it because when i update Tumblthree i usually overwrite the old file? because the token and download folder setting is still there when i reopen the program (not reverted to default)
This is the first time I've heard of this, and it has never happened to me. It doesn't make any sense, to be honest. The timeout and parallel connection settings get saved at the same time, with the same method, as the download folder settings. Thus, they should have reverted to defaults too if the settings weren't successfully saved.
Do you run multiple instances at once? Do you use the portable mode?
Thanks for the confirmation.
My portable setting was enabled, and no, I didn't use multiple instances at once. I also got that error saying maybe one referenced blog doesn't exist anymore.
I did a test though: I moved the settings.json and reconfigured it again, and now every setting is saved and there is no more "referenced blog doesn't exist anymore" error. Strangely, when I restore the old settings.json, it works as it should. I also checked the old settings.json file permissions; it's not read-only, and it's the same as the new settings.json.
So I don't know what caused that, but it can be solved by deleting or moving the settings.json, opening the program so it creates another settings.json and configuring it again, or by restoring the old settings.json.
Thanks
> so i don't know what cause that, but it can be solved by deleting the settings.json or move them, open the program so it create another settings json and just configure them again or just restore the old settings.json
Weird. The whole settings file is replaced, not just parts of it. So again, it doesn't make any sense to me. What's your Windows version?
Where is the settings file? I cannot find it. I removed all blogs and added them again, and the same problem happened, plus the preview stopped working! My Windows version is Windows 10 Enterprise; the problem is with version TumblThree-v1.0.8.18.
I checked an earlier version, 1.0.8.6; it works fine and loads blogs into the queue with no error.
All tests were on the release app, not the source code.
> where is the setting file, I cannot find it
C'mon, why am I writing all this? It's on the main page (README.md), under Application Usage -> Saved Settings.
Great, deleting the settings fixed the problem, but the preview is still not working.
It's not a bug in the application itself. It works here, and I haven't touched that code for months.
Did you delete all settings, including the settings.json where the "disable/enable preview" option is stored, add a new blog, and it still doesn't work?
> Weird. The whole settings file is replaced not just parts of it. So again, it doesn't make any sense too me. Whats your windows version?
Yes, it's weird indeed, as the token and download folder were already set up and were not reverted to default when I saved; it just wouldn't accept new settings (like it's read-only). Windows 10 Enterprise.
> yes it's weird indeed as the token and download folder already set up and are not reverted to default when i save it, it just wont accept new setting (like it's read only), windows 10 enterprise
Hmm... maybe the settings files got mixed up?
If you use the portable mode, could you double-check that the checkbox for portable mode is set in the Settings? Otherwise it will save your changes in the AppData settings, but might restore from the portable settings. If there is a settings.json next to the TumblThree.exe, it will read the settings from there, but that might not be what you've just set/saved.
I'm sure it's not an application issue, but something on your side. You can also edit the files manually before opening TumblThree. That will work for sure, and you'll know then whether the files were successfully read and/or modified.
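The precedence described above can be sketched in a few lines. This is an illustrative Python sketch of the "portable settings win over AppData settings" rule, with hypothetical function and directory names, not the actual C# code:

```python
import os

def settings_path(exe_dir, appdata_dir):
    """Pick which settings.json would be read.

    If a settings.json sits next to the executable (portable mode),
    it wins; otherwise fall back to the per-user AppData copy.
    Precedence and paths here are illustrative.
    """
    portable = os.path.join(exe_dir, "settings.json")
    if os.path.exists(portable):
        return portable
    return os.path.join(appdata_dir, "settings.json")
```

This kind of rule also explains the confusing symptom: saving can succeed in one location while the next start reads the other.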
@falbalus: For the re-download: If you check the "Check directory for files" checkbox, already downloaded files won't be added to the blog's db, and no download/resume/replace occurs if there is already a file with the same file name. If you don't check that checkbox, already downloaded files will be added to the blog's database, but they are also not re-downloaded if their file size is equal to or larger than the file that would potentially be downloaded.
Thus, the checkbox is actually redundant. I'm not sure right now why I implemented it that way, but earlier there was no resume in the TumblThree downloader and all files were replaced once they were handed to the downloader. Maybe that's why. So basically I can remove the checkbox, since the resume in the downloader has to check for already existing files anyway, otherwise it cannot continue at the right place.
@johanneszab I did check the portable mode. Well, anyway, that problem was solved earlier and I can't reproduce the issue, but I will try to edit the files manually if the problem occurs again.
Regarding the checkbox: so for me, with many 0-byte files and corrupted videos, it's better to uncheck "Check directory for files"? I always had that checkbox checked, thinking it was the other way around (that checking the checkbox would check the file size).
Also, would it be possible for the program to know if a video/image no longer exists (deleted by Tumblr)? I have a blog whose progress bar only reaches half, and I tried several times to delete and re-add it, until I manually checked the links and it turned out the images couldn't be loaded in a browser either, and other files were removed for copyright reasons.
@TehBotolSosro:
> i did check the portable mode, well anyway that problem is solved earlier and i can't reproduce the issue but i will try to edit manually if the problem occur again.
When I tried to reproduce the error, I noticed that the settings don't update if the queuelist cannot be saved. Thus, it probably came from the upgrade to the v1.0.8.18+ version. I actually didn't check that before releasing the newest version. I couldn't even remember that I'd deleted my settings, since I do that so frequently during development.
> regarding the checkbox, so for me who had many 0 bytes and corrupted video it's better for me to uncheck the "check directory for files"? since i always had that checkbox checked thinking its the other way around (checking the checkbox will check the file size)
Yes, turn it off. If you enable the checkbox, it only checks whether there is already a file with the same name. If there is one, it skips the download.
Whereas if you don't check the checkbox, the downloader will start to download the file, check whether there is an existing file, and resume it if its file size is below the one that's going to be downloaded. If it's the same or larger, it won't download anything but will mark that file as downloaded.
Thus, in the next release I'll remove that checkbox, since it has no use.
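The two code paths just described could be summarized like this. This is an illustrative Python sketch of the decision only (the function name, and treating `remote_size` as known up front, are simplifying assumptions, not the actual C# code):

```python
import os

def should_download(local_path, remote_size, check_directory_for_files):
    """Mimic the two 'Check directory for files' behaviors described above."""
    if check_directory_for_files:
        # Checkbox on: any file with the same name means skip,
        # with no size comparison at all.
        return not os.path.exists(local_path)
    # Checkbox off: compare sizes, so short (corrupt/partial)
    # files get resumed and complete ones are skipped.
    if os.path.exists(local_path):
        return os.path.getsize(local_path) < remote_size
    return True
```

With the checkbox off, a 0-byte or half-downloaded file fails the size check and gets downloaded/resumed, which is why that setting helps here.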
Thank you for confirming it; I will uncheck "Check directory for files".
And yes, I always update by overwriting the files. Glad you found the issue; I will delete the Queuelist.json next time I update.
Thanks again
While I would like an option to reset blogs (specific file types anyway, like just videos, so I can re-download corrupt files without having to re-download all the gifs/jpgs etc.), the fact that the app keeps track of what has and hasn't already been downloaded based on its records, and not on the contents of the folder, is very important to me; it's a feature I wish ripme had. I use this program as an archiver for my favorite blogs, but there's a lot of overlap in the content. The way the program works now allows me to download entire blogs and have a duplicate-scanning program remove all redundant content, which would otherwise bloat my HDD. After scanning, I don't have to worry about it continuously re-downloading the just-removed content each time I crawl for new content... like ripme does.
Would it be possible to add a re-download feature in addition to the rescan feature?
I have deleted some files physically and started a rescan. In my opinion, a rescan should determine that some files are missing, but obviously they are still "stored" somewhere in the index files.