SuperGouge / ChanThreadWatch

Fork of the original discontinued ChanThreadWatch.

Images Not Downloading #80

Closed MiraMinx closed 5 years ago

MiraMinx commented 6 years ago

Just got the program and haven't been able to get it to download any images. The HTML files download and the archiver recognizes how many images/files there are in a thread that it needs to download but it never gets past 0 completed and then after a bit just decides to cycle back to the timer until the next check.

SuperGouge commented 6 years ago

On what site are you encountering this problem? Is it happening on a specific thread? Also, can you upload the log file? On Windows, it should be found at %AppData%\Chan Thread Watch\log.txt.

MiraMinx commented 6 years ago

4chan, any thread any board. I've been looking at the log file every time I open it but it's always blank. Does that mean something?

SuperGouge commented 6 years ago

This should mean there is no unexpected error. It seems to work on my end, though. Could you try using another network connection or try saving the threads on a different folder/disk?

MiraMinx commented 6 years ago

Switching the drive they are downloading to did nothing, switching from a wired connection to wireless did nothing, and running the .exe as administrator did nothing. :(

DerSandmann-Badcode commented 6 years ago

It shouldn't matter, but could you give a bit more information on the system you're running it on?

Install Location, Operating system, and if you can find the settings file, that would be of some use as well.

Can you also include a screenshot of the UI? There's another version of CTW (The licence was updated recently) and I want to make sure you're running this one.

MiraMinx commented 6 years ago

Installed on C drive in the Downloads folder, Windows 10, Settings file attached, screenshot attached. settings.txt 2018-03-20

MiraMinx commented 6 years ago

BTW, I think it's fine for right now, but I think it's also a recurring problem. I'm using CTW to replace Ychan, which has become unreliable for me, but I think CTW is suffering from the same problem on my computer. I'm uncertain what's causing it.

To give as full an explanation as I can: CTW is now functioning on my PC, as is Ychan. BUT when CTW was giving me the problem I described, yesterday and the day before, so was Ychan. What happens in Ychan is that I put a thread URL in it, or already have some in there from the last time I launched the program (when it was functioning right), and those threads get deleted, probably because it pings the URL, gets no response, and then deletes them as if they had 404'd. That is by design, but mistaken here, since the threads very much still exist. That's confusing enough as it is, since it will work for a few days and then not work for a few days.

What makes things even more confusing: if something really is blocking these apps from seeing the threads, why didn't CTW report that the thread doesn't exist (I know that's not what it says, but you know what I mean), like it did on a thread just now that really did 404? And how come CTW was able to download a copy of the HTML file? Oh, and that does actually work, just with no thumbnails and images along with it.

SuperGouge commented 6 years ago

A 404'd file should be skipped and removed from the total of images to download, so if all images are 404'd you should see the counter going down (e.g. 0 of 100 to 0 of 99, etc.). Are you behind a proxy or on a public network maybe (university, etc.)?

MiraMinx commented 6 years ago

No. Home private network.

beebz0 commented 6 years ago

Adding on here rather than making a new issue because it is the same issue. CTW 1.16 can currently download from every site besides 4chan. It will recognize that there are images in the thread and grab the HTML file, but it does not start the downloads. In fact, if I try to close CTW now, it hangs because it cannot cancel one of the 4chan threads it is trying to download from, and I need to kill it with Task Manager to close it. Running on Windows 10 Ultimate on a home network. No VPNs or whatnot.

DerSandmann-Badcode commented 6 years ago

Odd, I'll give it a go from my end again. Last time I looked into this I was not able to reproduce this issue.

beebz0 commented 6 years ago

And now something else odd: I took a look at it and it no longer had the "Downloading images" status and was in the waiting period. Stopping and restarting it put it back into the stalled "Downloading images" status. The log.txt is empty.

DerSandmann-Badcode commented 6 years ago

I'm still having trouble reproducing this on my end... Anything notable on your setup? What virus protection are you using, if any?

SuperGouge commented 6 years ago

Working fine for me with CTW 1.16.0 on Windows 7 SP1 and Windows 10 Fall Creators Update. If you still have problems with this site, can you try accessing the images from your browser and see if it works?

It seems like 4chan was having some intermittent issues with their image server last month; this could also be linked to the issues we see arising here.

beebz0 commented 6 years ago

Images load if I try to open them in a browser. And my setup has nothing too special outside of ipv6 support. Which is funny, because I just tried to wget the image directly. It failed 5 or 6 times trying to connect to an ipv6 address and then eventually succeeded when it tried the ipv4 address. When I try that with other sites, it immediately attempts to connect to an ipv4 address and works. Could this be what's happening with CTW? If so, is it possible to force it to use ipv4? I don't see anything in the settings.

beebz0 commented 6 years ago

yeah, as soon as I disable ipv6 support for my network, CTW works perfectly fine
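
For readers wondering what "forcing IPv4" looks like in practice, here is a minimal sketch in Python rather than CTW's C#. It is an illustration only: `ipv4_only` and `resolve_ipv4` are hypothetical helper names, and nothing here reflects how CTW resolves hosts. The idea is simply to ask the resolver for `AF_INET` results (or filter a mixed result list), so a client never tries an unreachable IPv6 address first.

```python
import socket


def ipv4_only(addrinfos):
    """Keep only the IPv4 entries from a getaddrinfo()-style result list."""
    return [info for info in addrinfos if info[0] == socket.AF_INET]


def resolve_ipv4(host, port=443):
    """Resolve a host to IPv4 addresses only by requesting the AF_INET family."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is (address, port).
    return [info[4][0] for info in infos]
```

A caller would then connect to one of the returned addresses instead of letting the HTTP stack pick between the A and AAAA records on its own.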

DerSandmann-Badcode commented 6 years ago

I'll take a look. Thanks for digging a bit deeper into this.

jwshields commented 5 years ago

Any movement on this? Disabling IPV6 isn't really a feasible solution.

I tried digging into the code, I assume it's to do with the DNS lookup or something with WebRequest but it didn't seem as if there was any option to force 4 or 6

jwshields commented 5 years ago

I cleared my settings, blacklist, and log, and tried with 1.16 and 1.17. I was able to get the HTML to download once (no images though), but not any other times (tried with about 10 threads). I'm on Win10 1803 and have a dual stack network (behind NAT) with no outbound restrictions and nothing locally (AV/firewall) blocking the program.

Edited my hosts file to use the ipv4 addresses of i.4cdn.org and no change. Still not able to figure out what I did to get it to download the HTML

DerSandmann-Badcode commented 5 years ago

@jwshields It's an issue with DNS lookups. I'm pretty sure this was resolved in later versions of the .Net framework.

I can't test it myself, New Zealand isn't using Ipv6 yet. Are you able to grab the latest from the Git and change the .Net version to see if the problem still exists? If not, I could create a branch with that changed and create a release on it.

jwshields commented 5 years ago

I was able to build it on my machine, but it seems 4chan is down at the moment. Will report back once it's back up and I can test.

Another note: I tested (the released build, not mine) 1.17.0 on a Win10 machine with ipv4 only, and had similar issues of the html not downloading & no images.

jwshields commented 5 years ago

Looks like that fixed it for me. In VS I bumped the version from .NET Framework 2.0 to 4.7, and that got things downloading again.

Edit2: Built the solution as release rather than debug, moved it to the normal directory on my disk that I use CTW from, and I'm seeing the same issues. It creates the folder but nothing downloads, like before. Unsure what's happening; I don't believe it's a file perms issue. Placed the debug build into that location; this one won't pick up settings/threads from the executable directory, but it downloads things.

DerSandmann-Badcode commented 5 years ago

Thanks for confirming this, I was pretty confident .Net 2.0 has some bugs around DNS lookups. I'll see if there's a workaround for it besides using a newer version.

jwshields commented 5 years ago

I tried poking around the docs but couldn't find much of anything. But this also isn't a language I'm fluent in. Any idea about the downloading vs not with debug vs release?

DerSandmann-Badcode commented 5 years ago

@jwshields I've booted an IPV6 Windows box in California. Which directory are you running the exe from? Additionally, is it on all boards? I'm noticing some boards resolve to an IPV4 address while some are returning IPV6 addresses.

Does this say you have IPV6 enabled? https://ipv6test.google.com/

Even with it switching between V4 and V6 I'm not having issues when downloading images.

If you use a different DNS provider, does it still not work?

jwshields commented 5 years ago

Hmm. Have there been any changes in the settings.txt file recently? (1.15 and up) I just re-built the solution with the Release config set, and was unable to repro the issue from earlier of debug working & release not. But I had to completely wipe the directory (log, settings, blacklist, threads) - seems like it's working though.

To answer your questions though, my DNS and network are fully functional, dual stack, with nothing that is limiting how things resolve.

jwshields commented 5 years ago

Okay, sorry. Something with settings is causing an issue... I run CTW out of W:\ctw\ChanThreadWatch\ and have things set to save to W:\ctw\Unsorted and W:\ctw\Unsorted-Done, with a custom user agent. My settings.txt is below.

This config stops me from downloading things. If I delete the settings file, create an empty settings.txt - the application will download things and use the default download directory; But this just doesn't want to work.

DownloadFolder=W:\ctw\Unsorted
DownloadFolderIsRelative=0
MoveToCompletedFolder=1
CompletedFolder=W:\ctw\Unsorted-Done
CompletedFolderIsRelative=0
UseCustomUserAgent=1
CustomUserAgent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.80 Safari/537.36
SaveThumbnails=1
RenameDownloadFolderWithDescription=0
RenameDownloadFolderWithCategory=0
RenameDownloadFolderWithParentThreadDescription=0
ParentThreadDescriptionFormat= ({Parent})
SortImagesByPoster=0
RecursiveAutoFollow=1
InterBoardAutoFollow=1
UseOriginalFileNames=0
VerifyImageHashes=1
UseSlug=0
SlugType=Last
CheckForUpdates=1
BlacklistWildcards=1
MinimizeToTray=0
BackupThreadList=1
BackupEvery=5
BackupCheckSize=0
MaximumBytesPerSecond=3584000
WindowTitle={ApplicationName} | {TotalThreads} | {RunningThreads} | {DeadThreads} | {StoppedThreads}
UsePageAuth=0
PageAuth=
UseImageAuth=0
ImageAuth=
OneTimeDownload=0
AutoFollow=0
CheckEvery=3
OnThreadDoubleClick=1
ClientSize=760,500
ColumnWidths=110,150,115,115,110,75
ColumnIndices=0,1,2,3,4,5
SortColumn=3
SortAscending=1
LastUpdateCheck=20181229
ChildThreadsAreNewFormat=1

jwshields commented 5 years ago

... I apologize for all the spam. I've narrowed it down to the Maximum download speed setting. I slowly added in config options until it stopped downloading things. This isn't a language I know, but I believe I've narrowed it down to this commit: https://github.com/SuperGouge/ChanThreadWatch/commit/9871ca1798a7fc02e0807b147923ca8a10cdacb0?diff=split Lines 82, 93, and 96 of General.cs seem to be it, IMO.
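
The "add options back one at a time" approach described above can be sketched in Python. This is an illustration under stated assumptions, not part of CTW: `parse_settings` and `first_breaking_setting` are hypothetical names, and the `works` callback stands in for the manual step of launching the program with a given settings file and checking whether downloads start.

```python
def parse_settings(text):
    """Parse 'Key=Value' lines into a dict, splitting on the first '=' only."""
    settings = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value
    return settings


def first_breaking_setting(settings, works):
    """Enable settings one at a time, in file order.

    Returns the first key after which works(active_settings) reports failure,
    or None if every prefix of the settings still works.
    """
    active = {}
    for key, value in settings.items():
        active[key] = value
        if not works(dict(active)):
            return key
    return None
```

Splitting on the first `=` only keeps values such as a user agent string or a window title format intact even if they contain further `=` characters.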

DerSandmann-Badcode commented 5 years ago

@jwshields Yep, I'm able to reproduce it on my end.

This might have been broken since the stream throttling was added. When the 'first' download occurs, frmChanThreadWatch.ConcurrentDownloads (how many threads are downloading) is 0 in the following code block, so when we attempt to split the bandwidth between the current downloads we get a divide-by-zero error. The error then gets swallowed further up the stack, so nothing gets logged anywhere.

https://github.com/SuperGouge/ChanThreadWatch/blob/c02fbfe997f607a1031f7922bac607d6fd955016/Classes/Other.cs#L733-L757
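
The failure mode described above (a bandwidth cap divided by a download count that is still 0, with the resulting error swallowed) can be sketched in Python. This is not the CTW code, which is C# and linked above; the function names are hypothetical, and the `max(1, ...)` guard is just one possible workaround, not necessarily the fix the maintainers chose.

```python
def naive_limit(max_bytes_per_second, concurrent_downloads):
    """Split the cap evenly; raises ZeroDivisionError when the count is 0."""
    return max_bytes_per_second // concurrent_downloads


def per_download_limit(max_bytes_per_second, concurrent_downloads):
    """Split the cap evenly, treating a count of 0 as a single download."""
    active = max(1, concurrent_downloads)
    return max_bytes_per_second // active


def download_quietly(limit_fn, max_bytes_per_second, concurrent_downloads):
    """Mimic a caller that swallows every error: failure yields None, no log."""
    try:
        return limit_fn(max_bytes_per_second, concurrent_downloads)
    except Exception:
        return None
```

With the cap from the settings file above, `download_quietly(naive_limit, 3584000, 0)` fails silently, which matches the symptom of stalled downloads and an empty log, while the guarded version hands the whole cap to the first download.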

@SuperGouge The OnDownloadStart code would only run after the full response returns, which would be after we've actually downloaded all of the chunks? The number of threads currently being downloaded is only being set after the response returns. https://github.com/SuperGouge/ChanThreadWatch/blob/c02fbfe997f607a1031f7922bac607d6fd955016/Classes/ThreadWatcher.cs#L1218-L1229
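
One way to keep the count from being 0 during the first transfer is to register a download before any bytes are read and unregister it when the transfer ends, whether it succeeds or fails. The sketch below shows that ordering in Python; `DownloadCounter` and `track` are hypothetical names, and this assumes the count is meant to include in-flight transfers, which the thread suggests but does not confirm.

```python
import threading
from contextlib import contextmanager


class DownloadCounter:
    """Thread-safe count of downloads that are currently in flight."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0

    @property
    def active(self):
        with self._lock:
            return self._active

    @contextmanager
    def track(self):
        # Increment before the transfer starts, so throttling code that
        # runs during the transfer never sees a count of 0.
        with self._lock:
            self._active += 1
        try:
            yield
        finally:
            # Decrement even if the transfer raised.
            with self._lock:
                self._active -= 1
```

A download routine would wrap the whole request and read loop in `with counter.track():` so that the bandwidth split sees at least one active download from the first chunk onward.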

jwshields commented 5 years ago

Very nice find!