dCache / dcache

dCache - a system for storing and retrieving huge amounts of data, distributed among a large number of heterogenous server nodes, under a single virtual filesystem tree with a variety of standard access methods
https://dcache.org

HSM script arguments split where file name contains spaces #1806

Closed onnozweers closed 5 years ago

onnozweers commented 9 years ago

When dCache calls our HSM script to put a file whose name contains spaces, the -si argument is split at those spaces into separate arguments.

We have a loop like this to parse arguments:

while [ $# -gt 0 ] ; do
  # log "Processing argument: $1"
  # do some things
  shift 1
done

Then these are the arguments that are received. Please note the file name "onnotest met spaties":

08/13/15-11:55:38.020624330 13167 dmfcp.sh[2471][v14.11g,main,DEBUG]: : Processing argument: 'put'
08/13/15-11:55:38.024871098 13167 dmfcp.sh[2471][v14.11g,main,DEBUG]: : Processing argument: '000029BB27BF0B6249F086806662F7DD1C83'
08/13/15-11:55:38.028737311 13167 dmfcp.sh[2471][v14.11g,main,DEBUG]: : Processing argument: '/space/atlas/tape/pool/data/000029BB27BF0B6249F086806662F7DD1C83'
08/13/15-11:55:38.032784219 13167 dmfcp.sh[2471][v14.11g,main,DEBUG]: : Processing argument: '-si=size=7;new=true;stored=false;sClass=atlas:tape;cClass=-;hsm=osm;accessLatency=NEARLINE;retentionPolicy=CUSTODIAL;path=/pnfs/grid.sara.nl/data/atlas/onnotest'
08/13/15-11:55:38.048670680 13167 dmfcp.sh[2471][v14.11g,main,DEBUG]: : Processing argument: 'met'
08/13/15-11:55:38.052595239 13167 dmfcp.sh[2471][v14.11g,main,DEBUG]: : Processing argument: 'spaties;uid=36487;gid=31306;StoreName=atlas;LinkGroupId=1;flag-c=1:0ad2027a;store=atlas;group=tape;bfid=;'
08/13/15-11:55:38.056554410 13167 dmfcp.sh[2471][v14.11g,main,DEBUG]: : Processing argument: '-hsmBase=/cxfs/TIER1SC/tape_storage'
08/13/15-11:55:38.071549532 13167 dmfcp.sh[2471][v14.11g,main,DEBUG]: : Processing argument: '-pnfs=/pnfs/ftpBase'

This looks like a bug to me: it seems that dCache calls the HSM script without quotes around the arguments. Is that so? Could it be fixed?

Kind regards Onno
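The symptom in the log above can be reproduced in plain bash, independent of dCache. The following is only a sketch: `log_args` is a hypothetical stand-in for the HSM script's loop, and the storage info string is shortened and made up. It illustrates what whitespace splitting does to such an argument, not how dCache itself builds the command.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the HSM script's argument loop:
# prints one line per argument it receives.
log_args() {
  while [ $# -gt 0 ]; do
    printf "Processing argument: '%s'\n" "$1"
    shift 1
  done
}

# Shortened, made-up storage info containing a file name with spaces.
si='-si=size=7;path=/pnfs/example/onnotest met spaties;uid=1;'

# Unquoted expansion: the shell splits the value on whitespace.
split_count=$(log_args put $si | wc -l)

# Quoted expansion: the value arrives as a single argument.
whole_count=$(log_args put "$si" | wc -l)

echo "unquoted: $split_count arguments, quoted: $whole_count arguments"
```

Because the split happens before the loop runs, nothing inside the script can prevent it.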

kschwank commented 9 years ago

Hi Onno,

I hit something similar some time ago and expected it to be the missing quotes. In my case, however, it turned out that my HSM script parsed the options in a loop similar to yours, using some exec and awk magic that could not cope with spaces in file names. Could you check whether that could be the problem for you as well, and/or provide us with your script, so that we can help investigate? In my case, I solved the issue by parsing directly for the options I expect, e.g.:

si=$(echo "$*"|grep -o -e '-si=.*;')
mongoUrl=$(echo "$*"|grep -o -e '-mongoUrl=[^ $]*'|grep -o -e '[^=]*$')

Cheers, Karsten
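The targeted parsing Karsten describes could look roughly like the sketch below. The helper names `extract_si` and `si_field` and the sample arguments are hypothetical. It assumes the storage info ends with a `;` and that no later argument contains one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: recover the whole -si=... option from the arguments,
# even if it was already split on spaces.
extract_si() {
  # "$*" joins all arguments with single spaces; the greedy match runs
  # from "-si=" to the last ';' in the joined string.
  printf '%s\n' "$*" | grep -o -e '-si=.*;'
}

# Hypothetical helper: pull one key out of the ';'-separated storage info.
si_field() {
  local si=${1#-si=} key=$2
  printf '%s\n' "$si" | tr ';' '\n' | sed -n "s/^${key}=//p"
}

# Simulate arguments that arrive already split, as in the log above.
si=$(extract_si put 0000ABCD /pool/data/0000ABCD \
  '-si=size=7;path=/pnfs/example/onnotest' met 'spaties;uid=1;bfid=;' \
  '-pnfs=/pnfs/ftpBase')
path=$(si_field "$si" path)

echo "path: $path"
```

Rejoining with `"$*"` puts a single space between the fragments, so a file name with consecutive spaces would not be recovered exactly. This is a workaround, not a substitute for passing the arguments intact.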

onnozweers commented 9 years ago

Hi Karsten,

There is some additional parsing using some "eval" statements, much the same way as the HSM example script in the skel dir: https://github.com/dCache/dcache/blob/master/skel/share/lib/hsmcp.sh. However, I don't think that's the issue here; the while loop I posted is the first thing that touches the arguments and there already it's split. I added the logging directly below the while statement, before anything else is being done with the argument, to be sure what was going on.

IMHO the arguments should be quoted before being passed on to the script, because almost anything can be in the file name, even things like: ' - ' ' -somethingthatlookslikethenextargument' ... or double spaces, which make it extremely difficult to reconstruct arguments that have been accidentally split.

Does that answer your question?

Kind regards Onno
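As an illustration of the quoting Onno asks for, here is a minimal bash sketch, assuming the command is handed over as one flat string that a shell re-parses. It uses bash's `printf '%q'` to escape the value; the file name is made up, and this is not a description of what dCache does internally.

```shell
#!/usr/bin/env bash
# Made-up file name with a double space and a part that looks like an option.
name='onnotest  met -spaties'

# printf '%q' (bash) escapes the value so that a shell can read it back
# as exactly one word.
quoted=$(printf '%q' "$name")

# Re-parse a flat command string that embeds the escaped value.
eval "set -- put $quoted"

count=$#
restored=$2

echo "argument count: $count"
```

After the round trip, `restored` holds the original name, including the double space.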

kschwank commented 9 years ago

Hi Onno,

yes, that was the same parameter parsing routine I used before. For some reason that I still have to figure out, quoting the parameters didn't entirely fix the problem for me, so I still ended up with the targeted parsing approach mentioned above. However, it sounds reasonable to quote the parameters after all, so I reopened the patch I created earlier: https://rb.dcache.org/r/8408/

Cheers, Karsten

onnozweers commented 9 years ago

Thanks for the link to the RB ticket. It seems to me that Gerd is on the right track with his suggestion that the HSM script should be called as command + argument list instead of as a white space separated string.
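The difference between the two calling conventions can be sketched in bash with an array. This is a shell analogy only: `hsm_script` is a hypothetical stand-in that reports its argument count, and the values are made up.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for an HSM script: reports how many arguments it got.
hsm_script() { echo "$#"; }

file_path='/pnfs/example/onnotest met spaties'

# Command plus argument list: each array element is passed as exactly
# one argument, whatever characters it contains.
cmd=(hsm_script put 0000ABCD /pool/data/0000ABCD \
  "-si=size=7;path=${file_path};uid=1;" "-pnfs=/pnfs/ftpBase")
list_count=$("${cmd[@]}")

# Flat string: the same words joined by spaces, then split again on whitespace.
flat="${cmd[*]}"
string_count=$($flat)

echo "argument list: $list_count arguments, flat string: $string_count arguments"
```

With the argument list, no quoting or escaping of the file name is needed at all, which is why it is the more robust of the two approaches.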

gbehrmann commented 9 years ago

Karsten, didn't you submit a fix for this?

kschwank commented 9 years ago

Yes, but so far it is only in master (9a5c11797f33e58a65cee0635e3f4efb7471fba5)

kschwank commented 9 years ago

It is now also in 2.13

gbehrmann commented 9 years ago

@kschwank, if this bug is fixed, then please close it.

kofemann commented 5 years ago

@onnozweers Can we close this issue?

onnozweers commented 5 years ago

I haven't seen the issue anymore so I assume it has been fixed successfully. Thanks!