Open tinshuksingh opened 6 years ago
Hi Tinshuk -
The Uploader tool is included in our automated test suite in our environment so I believe it should not be too difficult to get working in your environment. And it's a good indication that the pre-registration worked.
I would like to collect some information. But first: you may have already discovered the Swagger docs that ship with each release. I think the link is in the CloudFormation output, so you may not have seen it. The docs are at /herd-app/docs/rest/index.html, and they will help you with the Storages GET below and many other REST calls you will be making in the future!
Please send:
I am also tagging @kenisteward here who can help troubleshoot. Thanks Tinshuk, Keni!
Hi @nateiam,
Please find the details you asked for:
Output from Storages GET for S3StorageUnit: { "name": "S3StorageUnit", "storagePlatformName": "S3", "attributes": [] }
java -jar herd-uploader-0.63.0.jar -a xxx -p xxx/xxx -l /home/xxx-user/herd-uploader -m manifest.json -H ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com -P 8080
@tinshuksingh
When the uploader tries the actual upload, it uses the BDATA's storage.directorypath to find the actual S3 location.
It looks like your storage doesn't have the attributes that tell it where your S3 path is. If you could, try doing a storage PUT with the following attributes:
{
"attributes": [
{
"name": "bucket.name",
"value": "yourBucketName"
}
]
}
If this doesn't work let us know. We think this should fix it with minimal changes but there are other knobs we can tweak.
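To confirm which attributes are still missing, you can inspect the Storages GET response. The following is a minimal Python sketch, not part of herd: `missing_attributes` is a hypothetical helper, and the required list contains only the attribute named in this thread.

```python
import json


def missing_attributes(storage, required):
    """Return the required attribute names that the storage does not define."""
    present = {attr.get("name") for attr in storage.get("attributes", [])}
    return [name for name in required if name not in present]


# Storages GET output as posted earlier in this thread.
storage_json = '{ "name": "S3StorageUnit", "storagePlatformName": "S3", "attributes": [] }'
storage = json.loads(storage_json)

# Lists every required attribute that still has to be added to the storage.
print(missing_attributes(storage, ["bucket.name"]))
```

An empty list would mean the storage already carries every attribute you asked about.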
@kenisteward
I updated the storage with attributes as:
{
"name": "S3StorageUnit",
"storagePlatformName": "S3",
"attributes": [
{
"name": "bucket.name",
"value": "bucketName"
}
]
}
but I am still getting the same error I mentioned earlier.
@tinshuksingh
Gotcha. It looks like you need to set the key prefix for the storage, since you can't set the storage directory in the manifest.json. Maybe we can make that a feature of the uploader? @nateiam
{
"name": "S3StorageUnit",
"storagePlatformName": "S3",
"attributes": [
{
"name": "bucket.name",
"value": "bucketName"
},
{
"name": "key.prefix.velocity.template",
"value": "your/velocity/key/prefix"
}
]
}
It looks like the herd-uploader's manifest.json doesn't allow you to specify the storage directory. Because of this, you'll have to set up the directory path via the storage's key.prefix.velocity.template attribute.
This can be any string. It also supports the following replaceable values:
S3 Key Prefix Velocity Template
$environment | The environment name.
$namespace | The namespace code.
$dataProviderName | The data provider name.
$businessObjectDefinitionName | The name of the business object definition.
$businessObjectFormatUsage | The business object format usage.
$businessObjectFormatFileType | The business object format file type.
$businessObjectFormatVersion | The version of the business object format.
$businessObjectDataVersion | The version of the business object data.
$businessObjectFormatPartitionKey | The partition key which must be pre-registered as part of the business object format.
$businessObjectDataPartitionValue | The business object data primary partition value.
$businessObjectDataPartitions | The ordered map of sub-partition column names to sub-partition values.
$CollectionUtils | org.apache.commons.collections4.CollectionUtils.class
Examples:
$environment/$namespace/$businessObjectDataPartitionValue
$namespace/some/random/choices/$businessObjectFormatFileType/$businessObjectDataPartitionValue
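To get a rough preview of how such a prefix expands, here is a minimal Python sketch. It does not run Velocity: it relies on Python's `string.Template`, which happens to share the `$name` syntax for simple variable references, and it performs literal substitution only. `preview_key_prefix` is a hypothetical helper, the `dev` environment name is made up, and the other values are taken from the manifest in this thread.

```python
from string import Template


def preview_key_prefix(template, values):
    """Substitute simple $name references; raises KeyError if a value is missing."""
    return Template(template).substitute(values)


values = {
    "environment": "dev",  # hypothetical environment name
    "namespace": "S3UploadNamespace",
    "businessObjectFormatFileType": "TXT",
    "businessObjectDataPartitionValue": "2014-04-01",
}

print(preview_key_prefix("$environment/$namespace/$businessObjectDataPartitionValue", values))
# prints: dev/S3UploadNamespace/2014-04-01
```

The actual key prefix is produced by the service, so treat this only as a check that the template references the variables you intend.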
@tinshuksingh Are you still having any issues?
Hi Team,
We created a business object definition and are now trying to upload a file to an S3 bucket using herd-uploader-0.63.0.jar from an EC2 instance. We followed the wiki steps you mentioned for uploading a file to S3.
We are able to pre-register the business object with the registration server successfully, but after that we get a null pointer exception while reading the directory path.
manifest.json
{ "namespace": "S3UploadNamespace", "businessObjectDefinitionName": "S3BusinessDefination", "businessObjectFormatUsage": "PRC", "businessObjectFormatFileType": "TXT", "businessObjectFormatVersion": "0", "partitionKey": "PROCESS_DATE", "partitionValue": "2014-04-01", "storageName": "S3StorageUnit", "subPartitionValues": [ "2014-04-01" ], "manifestFiles" : [ { "fileName" : "testFile1.gz", "rowCount" : 0 }, { "fileName" : "testFile2.gz", "rowCount" : 0 } ] }
We tried to pass the directory path in the manifest.json file, but it failed with an UnrecognizedPropertyException.
manifest.json
{ "namespace": "S3UploadNamespace", "businessObjectDefinitionName": "S3BusinessDefination", "businessObjectFormatUsage": "PRC", "businessObjectFormatFileType": "TXT", "businessObjectFormatVersion": "0", "partitionKey": "PROCESS_DATE", "partitionValue": "2014-04-01", "storageUnits": [ { "storageName": "S3StorageUnit", "storageDirectory": { "directoryPath": "Herd_poc_bucket" }, "storageFiles": [ { "filePath": "testFile1.txt", "fileSizeBytes": 0, "rowCount": 0 } ], "discoverStorageFiles": true }], "subPartitionValues": [ "2014-04-01" ], "manifestFiles" : [ { "fileName" : "testFile1.gz", "rowCount" : 0 }, { "fileName" : "testFile2.gz", "rowCount" : 0 } ] }
Please let us know if we are missing anything.
Thanks, Tinshuk