unregistered / Maxel

A native download accelerator for Mac OS X.
http://maxelapp.com/

Option to Automatically Skip file on Errors #95

Open jimabout opened 8 years ago

jimabout commented 8 years ago

If I enter a URL like this in Maxel:

https://vk.com/doc145274810_437834626?hash=9d90945c83c7817cc5&dl=4d66c3bd99093f5306

it will not automatically start downloading the detected file. It works fine for archive files like .zip or .rar, but for content types like pdf, png, gif, etc. it pops up a dialog box and waits for the user to click to initiate the download or "Cancel" before proceeding. (I believe it follows the default behavior, at least on Mac OS X, that Safari uses to determine which file types to render vs. download. For example, when I hacked Safari to always download pdf files instead of displaying them, Maxel started auto-downloading them. However, I don't want to permanently override that for gifs and pngs, or Safari won't be much of a browser.)

There are two issues here:

First of all, pausing to wait indefinitely for a response from the user defeats the purpose of the great batch "Add URLs" feature. The user is required to wait for each download in the batch to initiate, then to click download or cancel on each file. This means that one bad file can hold up the entire batch from completing.

Secondly, it would be great to be able to override in Maxel which file types download automatically and which prompt the user, independently of Safari (without permanently affecting the behavior of Safari/WebKit and other apps that use this underlying code).

Some potential solutions:

  1. Provide an option to automatically skip/pause downloads requiring user interaction so that the rest of the files in the batch will still download. I would rather come back the next day and resolve or retry all problem downloads at once, rather than have to sit all night waiting for each one to come up and resolve.
  2. It would be great to have an option when initiating a download to have Maxel remember this behavior for similar file types. A checkbox like "apply setting to all files like this", so I only have to make this decision once for each file type.
  3. Even better might be a content-types setting in the preferences that lets me decide which file types to ask about and which file types to automatically start downloading.
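
To make suggestion 2 concrete, here is a minimal sketch of what "remember this decision" could look like. It is not Maxel code: `DecisionMemory`, `askUser`, and the `"download"` / `"skip"` action names are all hypothetical, chosen only for illustration.

```javascript
// Hypothetical sketch: DecisionMemory and askUser are illustrative names,
// not part of Maxel.
class DecisionMemory {
  constructor() {
    this.remembered = new Map(); // file type -> "download" | "skip"
  }

  // askUser(fileType) stands in for the dialog and must return
  // { action, applyToAll }.
  decide(fileType, askUser) {
    const key = fileType.toLowerCase();
    if (this.remembered.has(key)) {
      return this.remembered.get(key); // no prompt needed
    }
    const { action, applyToAll } = askUser(key);
    if (applyToAll) {
      this.remembered.set(key, action); // "apply setting to all files like this"
    }
    return action;
  }
}
```

With something like this, the dialog would be shown at most once per file type whenever the checkbox is ticked, and every later file of that type in the batch would proceed without a prompt.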

I love Maxel, but this missing feature is (quite literally) keeping me up at night :) TIA

unregistered commented 8 years ago

Hi Jim

Thanks for the detailed suggestion. I need some time to look at the details but it looks like a great suggestion. Will follow up in a few days


unregistered commented 8 years ago

Hello,

Thanks for the suggestion.

If I understand correctly, you're adding a bunch of files which all open a built-in browser window, and this happens one by one and that's annoying to deal with (and then you have to force quit and reset maybe).

I agree that the behavior when adding multiple files in batch needs some work. I'm open to brainstorming. It's tricky because each time auth is required we pause the readyQueue to deal with it, preventing the other files from starting. This is important because in the usual case where you need to do login, when you add 20 files from the same site, the first login will unlock the rest.
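
As a starting point for that brainstorming, one possible compromise is to park only the downloads from a host that needs interaction and let the rest of the batch continue. The sketch below is purely illustrative: `partitionBatch` and `needsInteraction` are made-up names, and it assumes (as described above) that one login usually unlocks every file from the same site.

```javascript
// Hypothetical sketch: partitionBatch and needsInteraction are illustrative
// names and do not reflect how Maxel's readyQueue is implemented.
function partitionBatch(items, needsInteraction) {
  const blockedHosts = new Set(); // hosts waiting on the user
  const started = [];             // downloads that can begin unattended
  const deferred = new Map();     // host -> downloads parked for later

  for (const item of items) {
    const host = new URL(item.url).host;
    if (blockedHosts.has(host) || needsInteraction(item)) {
      // Park everything from this host: one login will likely unlock them all.
      blockedHosts.add(host);
      if (!deferred.has(host)) deferred.set(host, []);
      deferred.get(host).push(item);
    } else {
      started.push(item);
    }
  }
  return { started, deferred };
}
```

The user would then resolve one prompt per host whenever they come back, instead of one prompt per file while the whole queue waits.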

The next issue you pointed out refers to the website which is triggering the built-in browser instead of downloading the file. I'd like to point out that Maxel doesn't actually go by file extension when determining if it should open a link in the browser, but rather by the Content-Type header in the response. For this particular URL:

chris@Macbook-Pro:~/Documents/workspace/Maxel2$ curl -sD - -o /dev/null "https://vk.com/doc145274810_437834626?hash=9d90945c83c7817cc5&dl=4d66c3bd99093f5306"
HTTP/1.1 200 OK
Server: Apache
Date: Thu, 13 Oct 2016 04:46:31 GMT
Content-Type: text/html; charset=windows-1251
Content-Length: 2784
Connection: keep-alive
X-Powered-By: PHP/3.6157
Set-Cookie: remixlang=3; expires=Wed, 18 Oct 2017 07:06:47 GMT; path=/; domain=.vk.com
Pragma: no-cache
Cache-control: no-store

the content-type is html. There's no way for me to know where the actual file is, short of writing custom code for every website to extract the file.

The Chrome and Firefox plugins both do download interception, which may make downloading these files easier, otherwise you'll need a link to the file itself, which in this case is https://psv4.vk.me/c812334/u145274810/docs/571fb9b8f442/ikonki_steeldesign7.png?extra=CFQuxhKnl01nHIF2DGiTqsV7JUCFYeKjwHC87XBT8be7Qa5O3GGLHf0Yulff2rRIZJKZqdITOOJ0UD2bK0piqcchLoC0NSuykp0fqAtAsQ&dl=1

chris@Macbook-Pro:~/Documents/workspace/Maxel2$ curl -sD - -o /dev/null "https://psv4.vk.me/c812334/u145274810/docs/571fb9b8f442/ikonki_steeldesign7.png?extra=CFQuxhKnl01nHIF2DGiTqsV7JUCFYeKjwHC87XBT8be7Qa5O3GGLHf0Yulff2rRIZJKZqdITOOJ0UD2bK0piqcchLoC0NSuykp0fqAtAsQ&dl=1"
HTTP/1.1 200 OK
Server: Apache
Date: Thu, 13 Oct 2016 04:56:01 GMT
Content-Type: image/png
Content-Length: 347690
Connection: keep-alive
Last-Modified: Sun, 04 Sep 2016 04:22:38 GMT
ETag: "57cba18e-54e2a"
Expires: Thu, 20 Oct 2016 04:56:01 GMT
Cache-Control: max-age=604800
Content-Disposition: attachment
Accept-Ranges: bytes

Notice how the Content-Type is now image/png. You'll also find that Maxel can work with that link (in case it is no longer fresh by the time you read this: I just downloaded the file in Chrome and copied the URL from chrome://downloads/ )
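
For anyone who wants to check a batch of links the same way, here is a small sketch that pulls the MIME type out of a raw header dump like the `curl -sD -` output above. `contentTypeFromHeaders` is a hypothetical helper written for illustration; it is not how Maxel reads the header internally.

```javascript
// Hypothetical helper: extracts the MIME type from raw response headers.
function contentTypeFromHeaders(rawHeaders) {
  for (const line of rawHeaders.split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // status line or blank line
    const name = line.slice(0, idx).trim().toLowerCase();
    if (name === "content-type") {
      // Drop parameters such as "; charset=windows-1251".
      return line.slice(idx + 1).split(";")[0].trim().toLowerCase();
    }
  }
  return null; // no Content-Type header present
}
```

Fed the two header dumps above, this would yield `text/html` for the vk.com page and `image/png` for the direct link, which is the distinction that decides whether the built-in browser opens.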

jimabout commented 8 years ago

Thanks for the reply.

> the content-type is html. There's no way for me to know where the actual file is, short of writing custom code for every website to extract the file.

The interesting thing to me, though, is the fact that it does work fine in two other cases, even though they are doing the same thing (i.e. hitting a server that at first serves a content-type of html and then redirects to the final URL with a different content type), which makes me think it should be possible in more cases.

For example, try adding these two links to Maxel:

  1. This one redirects to a zip file: https://vk.com/doc29354273_437544935?hash=5490043a6d490af78e&dl=9f0a527a35dbb8f872
  2. This one redirects to a pdf file: https://vk.com/doc3644257_437731712?hash=4a35554ad498efe089&dl=8e8f8acf01ecfe6c36

Different results, right? (by the way, my Maxel is set to "Automatically Start New Downloads" and "Skip Add Sheet for New Downloads")

Once the redirect happens, then I believe safari/webkit determines how to handle the content-type. Now, try doing this:

Open up a terminal window: press ⌘-Space and type Terminal. Hit Enter when it's highlighted. When you get to the terminal prompt, type the following command:

defaults write com.apple.Safari WebKitOmitPDFSupport -bool YES

You won't get a response at the command line, but once you restart Safari it will no longer open them in the browser.

Then add the second (pdf) link above to Maxel again. For me, when I did this, Maxel suddenly WAS able to automatically start the pdf download and move down a list of pdf links like this without any interaction from me.

You are right, though: if I had a list of the final, translated URLs, any batch downloader would work great. But that's the whole trick: giving the downloader a batch of URLs and getting it to follow the redirects to the translated URL instead of the html file.

I can easily scrape all the links off of a page such as this https://vk.com/wall-99290991 to get my batch of links, but there is no way for me to run the html code of each link in a batch to get the translated URLs. That is why I like Maxel, because it usually DOES this! (for zip, rar, 7z, psd, etc.)

Previously, I would click on each link to open a bunch of tabs, which would fire the redirects and initiate the downloads, then I would open Chrome's download queue (which then contains the translated URLs), which I could then cut and paste into maxel. Then I would have to manually stop each of Chrome's downloads to avoid having them both download the file at the same time. It was a very tedious and annoying workflow.

The reason I love Maxel is because it is the only downloader I have found that actually follows redirects and will start downloading the file instead of downloading the original html page.

Is it not possible to have Maxel override the file handling behavior of webkit once it detects the new URL?

Thanks again for looking into this. Jim


unregistered commented 8 years ago

Hi Jim,

Sorry for the delay, I've been trying to get another beta release pieced together.

I tried out the examples, and while I was able to toggle Safari's PDF behavior, Maxel behaved the same both times regardless of the Safari setting.

Here's the code for how the browser decides whether to display it or download it:

- (BOOL)shouldDownloadMIMEType:(NSString*)type {
    // Can be overridden internally
    if (self.downloadAllMIMETypes) {
        return YES;
    }

    if ([type hasPrefix:@"audio/"] || [type hasPrefix:@"video"]) {
        return YES;
    }

    if ([type isEqualToString:@"application/pdf"] || [type isEqualToString:@"application/mp4"]) {
        return YES;
    }

    // Otherwise download things which we can't show in a browser
    return ![WebView canShowMIMEType:type];
}

Maxel defers to what Webkit thinks should be displayed in the browser as the default, but I have overrides for specific mime types: audio, video, pdf, and mp4. So this link: https://vk.com/doc3644257_437731712?hash=4a35554ad498efe089&dl=8e8f8acf01ecfe6c36 should have triggered a download in Maxel. (Also you can manually trigger Maxel to download a file that it's choosing to display by selecting the disclosure button (down arrow button) in Maxel's browser window and selecting "Download in Maxel", but I understand that can be frustrating if you have to do it to a batch of downloads).
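
To illustrate how a user-editable override list could slot into that logic, here is a JavaScript rendering of the same decision order. It is only a sketch: `extraTypes` is a hypothetical preference that does not exist in Maxel today, and `canShow` is a stand-in for `[WebView canShowMIMEType:]`.

```javascript
// Sketch only: mirrors the Objective-C method above. extraTypes is a
// hypothetical user preference; canShow stands in for WebKit's check.
function shouldDownloadMIMEType(type, options) {
  const { downloadAll = false, extraTypes = [], canShow = () => false } = options || {};

  if (downloadAll) return true;

  // Same built-in overrides as the original method.
  if (type.startsWith("audio/") || type.startsWith("video")) return true;
  if (type === "application/pdf" || type === "application/mp4") return true;

  // Hypothetical addition: types the user asked to always download.
  if (extraTypes.includes(type)) return true;

  // Otherwise download things which can't be shown in a browser.
  return !canShow(type);
}
```

Under this sketch, adding `image/png`, `image/jpeg`, and `image/gif` to `extraTypes` would make those types download directly while everything else keeps the current behavior.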

Also this specific page doesn't serve a redirect through HTTP but in javascript:

chris@Macbook-Pro:~/Library/Developer$ curl -sD - "https://vk.com/doc3644257_437731712?hash=4a35554ad498efe089&dl=8e8f8acf01ecfe6c36"
HTTP/1.1 200 OK
Server: Apache
Date: Sun, 16 Oct 2016 23:25:54 GMT
Content-Type: text/html; charset=windows-1251
Content-Length: 2891
Connection: keep-alive
X-Powered-By: PHP/3.6276
Set-Cookie: remixlang=3; expires=Sat, 21 Oct 2017 09:22:59 GMT; path=/; domain=.vk.com
Pragma: no-cache
Cache-control: no-store

Note that an HTML page is served. In the page we see the javascript which loads the true link, stored in var src:

...
<script>
function saveDoc() {
  var src = 'https://psv4.vk.me/c812434/u3644257/docs/fe4ddcf5dea5/Macworld_-_August_2016_USA_vk_com_stopthepress.pdf?extra=NgUz8vFw_JIS6my5SOxUzRye3DuO1Y42acDkFDRd9klKc9LJdrlMqvN12ylOv_iBTcyksH4TnfFK6xsSupgEx1h9R5Ym8GKJ97bNxQ';
  if (src.match(/\?/)) {
    src += '&dl=1';
  } else {
    src += '?dl=1';
  }
  location.replace(src);
  return false;
}
...
<body onkeydown="if (event.keyCode == 83 && (event.ctrlKey || event.metaKey)) return saveDoc(event);" onload="onload();">

So it has to go through the built-in browser. If you were to feed the direct link to Maxel, it would take it directly since it isn't a webpage.
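
For reference, this is roughly what the "custom code for every website" approach would look like for this one page. It is a fragile, site-specific sketch based only on the snippet shown above, not something Maxel does; `extractDirectLink` is a hypothetical name, and the regular expression would break as soon as the page markup changes.

```javascript
// Hypothetical, site-specific sketch: finds the "var src = '...'" assignment
// in the page and appends dl=1 the same way the page's saveDoc() does.
function extractDirectLink(html) {
  const match = html.match(/var\s+src\s*=\s*'([^']+)'/);
  if (!match) return null; // page layout differs from the snippet above
  const src = match[1];
  return src + (src.includes("?") ? "&dl=1" : "?dl=1");
}
```

The need for one such extractor per site is exactly why this is hard to support generically.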


Anyways, I've sort of strayed from the main issue into technical minutiae. Main takeaways:

  1. I wasn't able to repro the issue with the pdf file you included, which may indicate something else is going on I don't understand, maybe OS version or maxel version dependent. Please let me know your OS version and Maxel version and if I've missed anything.
  2. You are correct that Maxel uses a whitelist of types to automatically trigger a download, and that I could add a list for overriding in its preferences.
  3. In the long term I agree that it should be a setting.
  4. But in the short term, if there are file types that you think belong on the default list, I can add them to my next beta release. (You can join the beta if you aren't already in it; see Preferences > Maxel.)

jimabout commented 8 years ago

Thanks, you are very cool for sticking with me on this.

According to what you said above, it sounds like perhaps I was incorrect in suggesting Maxel couldn't always automatically download PDF files. Perhaps I confused Maxel with one of the many other downloaders I have tried. If so, I really apologize for the confusion. For reference, my OS version is 10.12 and Maxel version is 2.1.1(7026).

In any case, it sounds like you could indeed add additional overrides for other file types. The only ones I have ever had trouble with are PNG, JPG and GIF. (Because the browser wants to render them instead of download them). It would be awesome if you could add those to a beta release and I could try it out.

I agree, that an editable list in preferences is ultimately the most desirable. For reference, I have attached a screenshot of how they allow something similar in Folx:

[image: Inline image 1]

[image: Inline image 2]

I applied for the beta program. Best to you. Jim

