Closed francoisprunier closed 3 years ago
One problem is that a web server does not have much processing power for building the archive: https://issues.cloudera.org/browse/HUE-1112
A better solution would be to have a task server, but that's a big new dependency. Even better might be to run a job under the hood to create the zip on the grid (that way it will scale), which won't require any big dependency (just a job template).
@sai-krish FYI, after upload archive, would be a nice one
As Hue uses Knockout.js under the covers, you can run the following code from your browser's console to download all files on the given page:
for (var i = 0; i < viewModel.files().length; i++) {
    if (viewModel.files()[i].type === "file") {
        window.open("/filebrowser/download=/" + viewModel.files()[i].path);
    }
}
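A variant of the loop above, as a rough sketch: it builds the list of URLs first, so you can inspect it in the console before any window opens. The entry shape (`type`, `path`) and the URL prefix are copied from the loop above; `buildDownloadUrls` is just a hypothetical helper name, and paths are not URL-encoded here, which may matter if file names contain spaces.

```javascript
// Hypothetical helper: turn file-browser entries into download URLs.
// The {type, path} shape and the URL prefix are taken from the loop above.
function buildDownloadUrls(entries) {
  return entries
    .filter(function (entry) { return entry.type === "file"; })
    .map(function (entry) { return "/filebrowser/download=/" + entry.path; });
}

// In the Hue file browser console (assumes the page exposes viewModel, as above).
// The guard keeps the snippet harmless when viewModel is not defined.
if (typeof viewModel !== "undefined") {
  buildDownloadUrls(viewModel.files()).forEach(function (url) {
    window.open(url);
  });
}
```

Browsers may block the extra windows as pop-ups, so you might have to allow pop-ups for the Hue host first.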
I have a similar request. If you are lucky and your file names follow an indexed pattern, use the browser's "Copy as cURL" and then script the downloads in bash. If the file names are very dynamic, you need another script step to fetch the list of file names first.
If the files are too large for stdout, use wget or have curl write to a file.
for i in {0..99}; do curl "https://hue-endpoint.com/filebrowser/download=/your-path/part-0000${i}"; done >> /tmp/content
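One thing to watch: if the part numbers are zero-padded to a fixed width, the loop above builds the wrong name from index 10 onward (`part-000010`). A minimal sketch for generating the names, assuming a five-digit suffix such as `part-00042`; the width, the `partUrls` helper and its parameters are assumptions for illustration, and the host and path are the placeholders from the command above.

```javascript
// Hypothetical helper: build URLs for part files with a zero-padded index.
// width is the assumed number of digits in the suffix (5 gives part-00042).
function partUrls(baseUrl, count, width) {
  var urls = [];
  for (var i = 0; i < count; i++) {
    urls.push(baseUrl + "/part-" + String(i).padStart(width, "0"));
  }
  return urls;
}

// Print one URL per line, e.g. to save as a list for a download script.
partUrls("https://hue-endpoint.com/filebrowser/download=/your-path", 100, 5)
  .forEach(function (url) { console.log(url); });
```

The requests themselves still need whatever headers "Copy as cURL" gave you; this only produces the list of URLs.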
Maybe a v1 with Joseph's idea, then an Oozie batch job to compress everything into a single zip?
@manuelAldana I tried the curl syntax mentioned in your solution, but no matter what option I specify, it either gives me a 302 (I tried the -L option to resolve it) or dumps the JavaScript code for the Hue welcome page. I'd appreciate any pointers as to what I might be doing wrong.
Commands I have tried
curl --cacert
curl -L --cacert
This prints the JavaScript for the Hue home page.
I would also like to download folders. My school's research cluster only allows us to use Hue, not SSH, so that's the only way we can interface with it.
Currently, there is a 'Compress' action that will zip multiple files or folders together, so that they can be downloaded in one file (similarly to Google Drive)
Thanks, I'm having difficulty finding documentation on the "compress" action in Hue, as searching for it returns a lot of false positives about file compression in Hadoop and Hue. Can you point me in the right direction? We have CDH version 5.14 and Hue version 4.1.
In the filebrowser, if you have ENABLE_EXTRACT_UPLOADED_ARCHIVE enabled (which is enabled by default), you can select a file and from the actions dropdown, you can select the compress option. This was added with HUE-5506.
Hi @jdesjean, I am using the latest Cloudera 5.15.1, and when I try to run compress on a directory, this is what I get in my Oozie logs:
>>> Invoking Shell command line now >>
Stdoutput Created temporary output directory: /tmp/tmp.OB54gSmukf
Stdoutput Deleted temporary output directory: /tmp/tmp.OB54gSmukf
Exit code of the Shell command 1
<<< Invocation of Shell command completed <<<
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
Oozie Launcher failed, finishing Hadoop job gracefully
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://master:8020/user/maziyar/oozie-oozi/0000004-180827220407743-oozie-oozi-W/shell-19a1--shell/action-data.seq
Successfully reset security manager from org.apache.oozie.action.hadoop.LauncherSecurityManager@7876d598 to null
Oozie Launcher ends
@maziyarpanahi: You can get more information about the problem by looking at the job log instead of the workflow log. That being said, I suspect that you don't have the correct directory permissions where you are trying to execute the compress operation. The job needs to be able to copy the tmp file to the target directory.
Thanks @jdesjean for the response. I am actually compressing it in the same directory, which is my own directory with full permissions. Also, I copied those logs from the job logs. I will take another look at it, thanks again.
As Hue uses Knockout.js under the covers, you can run the following code from your browser's console to download all files on the given page:
for(var i = 0; i < viewModel.files().length; i++){ if(viewModel.files()[i].type === "file"){ window.open("/filebrowser/download=/" + viewModel.files()[i].path); } }
Is it also possible to browse through folders and download the contents as you described?
This issue is stale because it has been open 30 days with no activity and is not "roadmap" labeled or part of any milestone. Remove stale label or comment or this will be closed in 5 days.
Hi,
As far as I'm aware, it's not possible to download multiple files at once. It would be nice to be able to multi-select files in the file browser and download a zip file. It's especially useful if you want to download the content of a parquet directory for example, to get the data and meta-data at once.
I'm not a Python guy, but if you point me in the right direction as to where the code is and the general idea of how you'd do it, I'll try to send a PR with the enhancement.
Thanks, François