peak / s5cmd

Parallel S3 and local filesystem execution tool.
MIT License

Can't capture the error when we run s5cmd for multiple commands via Java ProcessBuilder #729

Closed anudina closed 4 months ago

anudina commented 4 months ago

Hi team, I am running multiple commands from a command file. I tried running the s5cmd command through Java ProcessBuilder as shown below.

Code:

processBuilder = new ProcessBuilder();
String command = "export AWS_ACCESS_KEY_ID=XXXXXXXXXX; export AWS_SECRET_ACCESS_KEY=YYYYYYYYY; s5cmd --endpoint-url=endpoint run commandFile";
processBuilder.command("bash", "-c", command);
try {
    processBuilder.redirectErrorStream(true);
    process = processBuilder.start();
    InputStream stderr = process.getInputStream();
    InputStreamReader isr = new InputStreamReader(stderr);
    BufferedReader br = new BufferedReader(isr);
    String line = null;
    while ((line = br.readLine()) != null) {
        System.out.println(line);
    }
}

My command file (commandFile) has the following content:

cp s3://XXXXXYYYYXXX/SomePATH/641410_CENTINELA_statsfile1.xml.gz    /usr/apps/DestinationDirectory
cp s3://XXXXXYYYYXXX/SomePATH/641410_CENTINELA_statsfile2.xml.gz   /usr/apps/DestinationDirectory
cp s3://XXXXXYYYYXXX/SomePATH/641410_CENTINELA_statsfile3.xml.gz   /usr/apps/DestinationDirectory
cp s3://XXXXXYYYYXXX/SomePATH/641410_CENTINELA_statsfile4.xml.gz   /usr/apps/DestinationDirectory

When the files 641410_CENTINELA_statsfile1 through 641410_CENTINELA_statsfile4 are present, they are copied without any issues. If a file is missing, I would expect an error, but when running through ProcessBuilder I don't see any errors logged, on either the error stream or the input stream.

By contrast, when I execute s5cmd from the command line without ProcessBuilder, it reports the errors properly, as below:

ERROR "cp s3://XXXXXYYYYXXX/SomePATH/641410_CENTINELA_statsfile1.xml.gz /usr/apps/DestinationDirectory": NoSuchKey: The specified key does not exist. status code: 404, request id: 97788ef2-ffef-1fff-87c6-043f72cf300a, host id:
ERROR "cp s3://XXXXXYYYYXXX/SomePATH/641410_CENTINELA_statsfile2.xml.gz /usr/apps/DestinationDirectory": NoSuchKey: The specified key does not exist. status code: 404, request id: 97788ef2-ffef-1fff-87c6-043f72cf300a, host id:
ERROR "cp s3://XXXXXYYYYXXX/SomePATH/641410_CENTINELA_statsfile3.xml.gz /usr/apps/DestinationDirectory": NoSuchKey: The specified key does not exist. status code: 404, request id: 97788ef2-ffef-1fff-87c6-043f72cf300a, host id:
ERROR "cp s3://XXXXXYYYYXXX/SomePATH/641410_CENTINELA_statsfile4.xml.gz /usr/apps/DestinationDirectory": NoSuchKey: The specified key does not exist. status code: 404, request id: 97788ef2-ffef-1fff-87c6-043f72cf300a, host id:

Why are the s5cmd errors not passed to the Java ProcessBuilder, and how can we capture them reliably? We want to rerun the failed commands at a later point in time.

igungor commented 4 months ago

Hello,

I can't help you with using s5cmd in Java, but the run command writes success and error messages to stdout and stderr, respectively.

Here's what I tried:

$ cat run.txt
cp existent-file     s3://bucket/ibrahim/
cp non-existent-file s3://bucket/ibrahim/

$ s5cmd --json run run.txt 1>out 2>err
$ echo $?
1

$ cat out
{"operation":"cp","success":true,"source":"existent-file","destination":"s3://bucket/ibrahim/existent-file","object":{"type":"file","size":1987}}

$ cat err
{"operation":"cp","command":"cp non-existent-file s3://bucket/ibrahim/","error":"given object non-existent-file not found"}
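
For the Java side of the question, here is a minimal sketch of how the same separation could be used from ProcessBuilder. It is not taken from the s5cmd documentation. It assumes s5cmd is on the PATH, reuses the placeholder credentials, endpoint and commandFile from the question, and combines the `--json` and `--endpoint-url` flags seen in the two commands above. The class name `S5cmdRunSketch` and the helpers `readLines` and `failedCommands` are hypothetical names made up for illustration. The regular expression assumes the `command` field contains no escaped quotes; a real JSON parser would be the safer choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class S5cmdRunSketch {

    // Matches the "command" field of the JSON error lines shown above.
    // Assumption: the command text contains no escaped quotes.
    private static final Pattern COMMAND_FIELD =
            Pattern.compile("\"command\":\"([^\"]*)\"");

    /** Extracts the failed commands from the lines s5cmd wrote to stderr. */
    static List<String> failedCommands(List<String> stderrLines) {
        List<String> failed = new ArrayList<>();
        for (String line : stderrLines) {
            Matcher m = COMMAND_FIELD.matcher(line);
            if (m.find()) {
                failed.add(m.group(1));
            }
        }
        return failed;
    }

    /** Reads a stream to the end, returning one list entry per line. */
    static List<String> readLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(
                "s5cmd", "--json", "--endpoint-url=endpoint", "run", "commandFile");
        // Pass credentials through the child environment instead of "export" in a shell string.
        pb.environment().put("AWS_ACCESS_KEY_ID", "XXXXXXXXXX");
        pb.environment().put("AWS_SECRET_ACCESS_KEY", "YYYYYYYYY");
        // redirectErrorStream(true) is deliberately not set, so stdout and stderr stay apart.
        Process process = pb.start();

        // Drain stderr on another thread so a full pipe cannot block s5cmd.
        CompletableFuture<List<String>> stderr =
                CompletableFuture.supplyAsync(() -> readLines(process.getErrorStream()));
        List<String> stdout = readLines(process.getInputStream());

        int exitCode = process.waitFor();
        List<String> failed = failedCommands(stderr.get());

        System.out.println("exit code: " + exitCode);
        System.out.println("successful operations: " + stdout.size());
        // The failed commands could be written to a new command file and rerun later.
        failed.forEach(cmd -> System.out.println("failed: " + cmd));
    }
}
```

Checking the exit code as well as stderr gives two independent signals: in the shell session above the exit code was 1 when one of the commands failed, so a non-zero value is a cheap first check before parsing the error lines.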