GoogleCloudPlatform / appengine-mapreduce

A library for running MapReduce jobs on App Engine
https://github.com/GoogleCloudPlatform/appengine-mapreduce/wiki/1-MapReduce
Apache License 2.0

BigQueryGoogleCloudStorageStoreOutput: fileNamePattern wrong naming. #87

Open dlazerka opened 8 years ago

dlazerka commented 8 years ago

BigQueryGoogleCloudStorageStoreOutput requires a "fileNamePattern" constructor parameter, but it does not actually accept a pattern. I first wanted to know what pattern belongs there (it's undocumented), so I read the code of SizeSegmentedGoogleCloudStorageFileOutput, whose Javadoc says:

   /** @param fileNamePattern a java format string {@link java.util.Formatter} containing one int
   *        argument for the shard number and another int argument for the segment number for e.g.
   *        shard-%04d-segment-%04d.

So I supplied a String like "foo-%04d-bar-%04d", but it later throws java.util.MissingFormatArgumentException: Format specifier '04d'.

After debugging, I found that it actually builds its own format string, with the constructor argument substituted for fileNamePattern: BigQueryFilesToLoad/Job-fileNamePattern/Shard-%04d/file-%04d
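
Here is a minimal sketch of how this failure can arise. It assumes the class simply concatenates the constructor argument into the template string seen in the debugger and then formats it with only the shard and segment numbers. `FileNamePatternDemo`, `buildPattern`, and `render` are hypothetical names for illustration, not part of the library.

```java
import java.util.MissingFormatArgumentException;

public class FileNamePatternDemo {
    // Assumption: mirrors the template string observed in the debugger,
    // with the constructor argument concatenated in as plain text.
    static String buildPattern(String fileNamePattern) {
        return "BigQueryFilesToLoad/Job-" + fileNamePattern + "/Shard-%04d/file-%04d";
    }

    // Formats the combined template with exactly two int arguments.
    static String render(String fileNamePattern, int shard, int segment) {
        return String.format(buildPattern(fileNamePattern), shard, segment);
    }

    public static void main(String[] args) {
        // A plain job name leaves only two specifiers, so formatting succeeds.
        System.out.println(render("myjob", 3, 7));

        // A pattern-style argument adds two more specifiers; with only two
        // arguments supplied, the third specifier has nothing to consume.
        try {
            render("foo-%04d-bar-%04d", 3, 7);
        } catch (MissingFormatArgumentException e) {
            System.out.println("Failed: " + e.getMessage());
        }
    }
}
```

Under that assumption, a value without format specifiers (effectively a job name) works, while a real pattern yields four specifiers for two arguments and triggers the exception.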

I believe the constructor parameter should be renamed to something like "jobName".

aozarov commented 8 years ago

@dlazerka you are correct. Do you want to send a PR to fix it?