cloudera / hue

Open source SQL Query Assistant service for Databases/Warehouses
https://cloudera.com
Apache License 2.0

File browser - Download multiple files at once #208

Closed francoisprunier closed 3 years ago

francoisprunier commented 9 years ago

Hi,

As far as I'm aware, it's not possible to download multiple files at once. It would be nice to be able to multi-select files in the file browser and download a zip file. It's especially useful if you want to download the content of a parquet directory for example, to get the data and meta-data at once.

I'm not a python guy, but if you point me in the right direction in where the code is and the general idea of how you'd do it, I'll try to send a PR with the enhancement.

Thanks, François

romainr commented 9 years ago

One problem with that is that a Web server does not have much processing power for doing that: https://issues.cloudera.org/browse/HUE-1112

A better solution would be to have a task server, but that's a big new dependency. Even better might be to run a job under the hood to create the zip on the grid (that way it will scale), which won't require any big dependency (just a job template).

romainr commented 7 years ago

@sai-krish FYI, this would be a nice one to do after the archive upload.

josephpconley commented 7 years ago

As Hue uses Knockout.js under the covers, you can run the following code from your browser's console to download all files on the given page:

// Open a download for every plain file listed on the current File Browser page.
var files = viewModel.files();
for (var i = 0; i < files.length; i++) {
     if (files[i].type === "file") {
          window.open("/filebrowser/download=/" + files[i].path);
     }
}
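
The loop above concatenates the raw path into the URL, so a file name containing a space, `#` or `?` would produce a broken link. A minimal sketch of building the URLs separately, with each path segment percent-encoded. The helper name `buildDownloadUrls` is hypothetical, the entry shape (`type`, `path`) mirrors the snippet above, and the `/filebrowser/download=` prefix is taken from this thread rather than from Hue's documentation.

```javascript
// Hypothetical helper: entries are assumed to look like { type, path },
// as in the console snippet above.
function buildDownloadUrls(entries, prefix = "/filebrowser/download=") {
  return entries
    .filter((entry) => entry.type === "file")
    .map((entry) => {
      // Drop leading slashes so exactly one "/" follows the prefix,
      // then percent-encode every path segment.
      const segments = entry.path.replace(/^\/+/, "").split("/");
      return prefix + "/" + segments.map((s) => encodeURIComponent(s)).join("/");
    });
}

// Example with made-up entries.
const sample = [
  { type: "dir", path: "/user/demo/table" },
  { type: "file", path: "/user/demo/table/part 0.parquet" },
];
console.log(buildDownloadUrls(sample));
```

In the browser console this could be combined with the original idea, for example `buildDownloadUrls(viewModel.files()).forEach((url) => window.open(url));`, keeping in mind that popup blockers may stop all but the first window.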

manuelAldana commented 7 years ago

I have a similar request. If you are lucky and your file names follow an indexed pattern, use the browser's "Copy as cURL" and then script the downloads in bash. If the file names are more dynamic, you need to add another script step to fetch the list of file names.

If the files are too large for stdout, have wget or curl write to a file instead.

 for i in {0..99}; do curl "https://hue-endpoint.com/filebrowser/download=/your-path/part-0000${i}"; done >> /tmp/content
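
The bash loop depends on the file names following a numeric pattern, and a fixed `part-0000` prefix only lines up for single-digit indexes if the names are zero-padded. A minimal sketch of generating the URL list with padded indexes. The host and `your-path` are the placeholders from the command above, and the five-digit `part-00000` naming is an assumption about how the files are named.

```javascript
// Build the download URLs for part files numbered 0..count-1.
// Assumption: files are named part-00000, part-00001, ... (five digits by default).
function partFileUrls(base, count, width = 5) {
  const urls = [];
  for (let i = 0; i < count; i++) {
    urls.push(`${base}/part-${String(i).padStart(width, "0")}`);
  }
  return urls;
}

const partUrls = partFileUrls(
  "https://hue-endpoint.com/filebrowser/download=/your-path",
  100
);
console.log(partUrls[0]);
console.log(partUrls[99]);
```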

romainr commented 7 years ago

Maybe a v1 with Joseph's idea, then an Oozie batch job to compress everything into a single zip?

mailtorichasharma commented 7 years ago

@manuelAldana I tried the curl syntax from your solution, but no matter which options I specify, I either get a 302 (I tried the -L option to resolve it) or a dump of the JavaScript for the Hue welcome page. I would appreciate any pointers as to what I might be doing wrong.

Commands I have tried:

curl --cacert "https://hue-endpoint.com/filebrowser/download=//" --verbose

This gives HTTP/1.1 302 FOUND.

curl -L --cacert "https://hue-endpoint.com/filebrowser/download=//" --verbose

This prints the JavaScript for the Hue home page.
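
A 302 followed by the home page HTML is what a request without the browser's session cookie would typically produce. That is a guess and is not confirmed anywhere in this thread. A minimal sketch of a check that tells a real file response apart from a login redirect, assuming Node 18+ (global `fetch`). The helper names are hypothetical, and the cookie string is whatever "Copy as cURL" shows for your own session.

```javascript
// Hypothetical helper: a redirect or an HTML body suggests we were sent to a
// login or home page rather than the file. Caveat: a genuine .html file
// download would also be flagged by this heuristic.
function looksLikeLoginRedirect(status, contentType) {
  const redirected = status >= 300 && status < 400;
  const html = (contentType || "").toLowerCase().startsWith("text/html");
  return redirected || html;
}

// Fetch one file, sending the session cookie copied from the browser.
async function downloadWithCookie(url, cookie) {
  const response = await fetch(url, {
    headers: { Cookie: cookie },
    redirect: "manual", // surface the 302 instead of silently following it
  });
  const contentType = response.headers.get("content-type");
  if (looksLikeLoginRedirect(response.status, contentType)) {
    throw new Error(`Not authenticated? Got HTTP ${response.status} for ${url}`);
  }
  return Buffer.from(await response.arrayBuffer());
}
```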

gabefair commented 6 years ago

I would also like to download folders. My school's research cluster only allows us to use Hue, not SSH, so that's the only way we can interface with it.

romainr commented 6 years ago

Currently, there is a 'Compress' action that will zip multiple files or folders together, so that they can be downloaded in one file (similarly to Google Drive)

gabefair commented 6 years ago

Thanks. I'm having difficulty finding documentation on the "Compress" action in Hue, as searching for it returns a lot of false positives about file compression in Hadoop and Hue. Can you point me in the right direction? We have CDH version 5.14 and Hue version 4.1.

jdesjean commented 6 years ago

In the filebrowser, if you have ENABLE_EXTRACT_UPLOADED_ARCHIVE enabled (which is enabled by default), you can select a file and from the actions dropdown, you can select the compress option. This was added with HUE-5506.


maziyarpanahi commented 6 years ago

Hi @jdesjean, I am using the latest Cloudera 5.15.1, and when I try to compress a directory, this is what I get in my Oozie logs:

>>> Invoking Shell command line now >>

Stdoutput Created temporary output directory: /tmp/tmp.OB54gSmukf
Stdoutput Deleted temporary output directory: /tmp/tmp.OB54gSmukf
Exit code of the Shell command 1
<<< Invocation of Shell command completed <<<

<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]

Oozie Launcher failed, finishing Hadoop job gracefully

Oozie Launcher, uploading action data to HDFS sequence file: hdfs://master:8020/user/maziyar/oozie-oozi/0000004-180827220407743-oozie-oozi-W/shell-19a1--shell/action-data.seq
Successfully reset security manager from org.apache.oozie.action.hadoop.LauncherSecurityManager@7876d598 to null

Oozie Launcher ends

jdesjean commented 6 years ago

@maziyarpanahi: You can get more information about the problem by looking at the job log instead of the workflow log. That being said, I suspect that you don't have the correct directory permissions where you are trying to execute the compress operation. The job needs to be able to copy the tmp file to the target directory.

maziyarpanahi commented 6 years ago

Thanks @jdesjean for the response. I am actually compressing it in the same directory, which is my own directory with full permissions. Also, I copied those logs from the job logs. I will take another look at it, thanks again.

pankaj-kvhld commented 3 years ago

As Hue uses Knockout.js under the covers, you can run the following code from your browser's console to download all files on the given page:

// Open a download for every plain file listed on the current File Browser page.
var files = viewModel.files();
for (var i = 0; i < files.length; i++) {
     if (files[i].type === "file") {
          window.open("/filebrowser/download=/" + files[i].path);
     }
}

Is it also possible to browse through folders and download their contents the way you described?
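
The thread does not answer this, but the traversal itself can be sketched independently of Hue. The example below walks a directory tree given a listing callback and collects every file path. `collectFiles` and `listDir` are hypothetical names, the `"dir"` / `"file"` type values are assumptions modelled on the snippet above, and how to obtain a directory listing from Hue is left open.

```javascript
// Collect the paths of all files under `root`.
// `listDir` is a hypothetical async callback returning [{ type, path }, ...]
// for one directory; it is not a Hue API.
async function collectFiles(root, listDir) {
  const files = [];
  const pending = [root];
  while (pending.length > 0) {
    const dir = pending.pop();
    for (const entry of await listDir(dir)) {
      if (entry.type === "dir") {
        pending.push(entry.path); // visit sub-directories later
      } else if (entry.type === "file") {
        files.push(entry.path);
      }
    }
  }
  return files.sort();
}

// In-memory stand-in for a real directory listing.
const tree = {
  "/data": [
    { type: "dir", path: "/data/t" },
    { type: "file", path: "/data/a.csv" },
  ],
  "/data/t": [{ type: "file", path: "/data/t/part-00000" }],
};
const listFromTree = async (dir) => tree[dir] || [];
collectFiles("/data", listFromTree).then((paths) => console.log(paths));
```

Each collected path could then be turned into a download URL in the same way as in the loop quoted above.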

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity and is not "roadmap" labeled or part of any milestone. Remove stale label or comment or this will be closed in 5 days.