Closed michael-conway closed 10 years ago
1.1 Logical perspective
A dataverse contains [0..n] studies; a study contains [0..n] files, such as SPSS data, image, or text files. [Note: A parsable statistical file such as an SPSS data file goes through extra steps, whereas non-parsable files are copied to a file system without these extra steps.]
1.2 Physical(storage) perspective
The above hierarchical dataverse-study relationship is _NOT_ mapped to the storage system, i.e., no directory hierarchy such as /${dataverse_id}/${study_id} exists.
However, the above study-files relationships are mapped to directory-files ones; for example,
A study whose StudyId is 10037 is literally mapped to a sub-directory of the local file system:
../10037
A file uploaded to the above study is ultimately stored under the above sub-directory as follows:
../10037/750
where the fileId (=750) is generated automatically by its corresponding database table.
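The study-to-directory and file-to-entry mapping described above can be sketched as follows. This is only an illustration: the class name `StudyFilePath`, the method `resolve`, and the root directory `studies` are hypothetical and do not come from the Dataverse 3.x code base.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper illustrating the <root>/<studyId>/<fileId> layout.
class StudyFilePath {

    // Maps a study ID and a database-generated file ID to a storage path.
    // The study ID is a String so that prefixed IDs (e.g. "ODUM-IRODS_10010") also fit.
    static Path resolve(Path root, String studyId, long fileId) {
        return root.resolve(studyId).resolve(Long.toString(fileId));
    }

    public static void main(String[] args) {
        Path root = Paths.get("studies"); // assumed root directory
        // Study 10037, file 750, as in the example above.
        System.out.println(resolve(root, "10037", 750));
    }
}
```

Note that the dataverse does not appear anywhere in the path, which matches the point that the dataverse-study hierarchy is not reflected in storage.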
Dataverse 4.0 abandoned the above logical scheme (both study and dataverse are called "dataset"), and we would like to minimize coding effort that is specific to Dataverse 3.x. I therefore decided not to touch the current database tables of Dataverse 3.x and came up with simpler solutions for the forthcoming demos with Dataverse 3.x. One of these solutions is that a study whose files are stored on an iRODS instance _must_ have an ID with a specific prefix such as "ODUM-IRODS", so that the iRODS-specific logic can kick in without looking up a study table. Therefore, as I mentioned in my previous e-mail, it is imperative that an API user be able to set the ID of a new study; the following curl example shows how to do this.
// curl example
curl -k --data-binary @atom-entry-study.xml -H "Content-Type: application/atom+xml" -u akio:akio https://localhost:8181/dvn/api/data-deposit/v1/swordv2/collection/dataverse/dtdv
where "dtdv" is the alias of the target dataverse (the implicit assumption is that a Deposit API user knows this alias beforehand); "atom-entry-study.xml" contains the minimum set of metadata needed to create a study whose ID is user-specified (see the example below); the first "akio" is a registered dataverse user name; and the second "akio" is that user's password.
// The contents of atom-entry-study.xml
<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:dcterms="http://purl.org/dc/terms/">
<dcterms:title>irods-testing study created by an undocumented deposit API command</dcterms:title>
<dcterms:creator>Akio Sone</dcterms:creator>
<dcterms:identifier>hdl:TEST/ODUM-IRODS_10010</dcterms:identifier>
</entry>
where "hdl:TEST/" in the dcterms:identifier tag is a boilerplate token used when a handle server is not specified. In terms of the aforementioned storage hierarchy, "TEST" is actually the parent directory of the directory that represents a study.
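As a rough sketch of how such an identifier could be taken apart and checked for the iRODS prefix without a database lookup, consider the following. The class `StudyIdentifier`, its field names, and the method `isIrodsStudy` are hypothetical; only the identifier format and the "ODUM-IRODS" prefix come from the text above.

```java
// Hypothetical parser for identifiers of the form <protocol>:<authority>/<studyId>,
// e.g. "hdl:TEST/ODUM-IRODS_10010".
class StudyIdentifier {

    // Prefix that marks a study whose files live on an iRODS instance.
    static final String IRODS_PREFIX = "ODUM-IRODS";

    public final String protocol;  // e.g. "hdl"
    public final String authority; // e.g. "TEST" (also the parent directory of the study)
    public final String studyId;   // e.g. "ODUM-IRODS_10010"

    StudyIdentifier(String protocol, String authority, String studyId) {
        this.protocol = protocol;
        this.authority = authority;
        this.studyId = studyId;
    }

    // Splits at the first ':' and at the first '/' that follows it.
    static StudyIdentifier parse(String identifier) {
        int colon = identifier.indexOf(':');
        int slash = identifier.indexOf('/', colon + 1);
        if (colon <= 0 || slash <= colon + 1 || slash == identifier.length() - 1) {
            throw new IllegalArgumentException("not a valid identifier: " + identifier);
        }
        return new StudyIdentifier(
                identifier.substring(0, colon),
                identifier.substring(colon + 1, slash),
                identifier.substring(slash + 1));
    }

    // True when the study ID carries the iRODS prefix, so no table lookup is needed.
    boolean isIrodsStudy() {
        return studyId.startsWith(IRODS_PREFIX);
    }

    public static void main(String[] args) {
        StudyIdentifier id = parse("hdl:TEST/ODUM-IRODS_10010");
        System.out.println(id.authority + " / " + id.studyId + " / irods=" + id.isIrodsStudy());
    }
}
```

A study created with a plain numeric ID, such as "hdl:TEST/10000", would not match the prefix and would follow the regular local-file-system path.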
The following is a sample post method:
public static void testUploadStudy() {
boolean local = true;
String user = "akio";
String password = "akio";
String hostname = "dvntest.irss.unc.edu"; // bart=244 lisa=246 maggie=253
if (local) {
hostname = "localhost";
}
logger.log(Level.INFO, "test machine hostname={0}", hostname);
// logger.log(Level.INFO, "newAUNameList:\n{0}", xstream.toXML(newAUNameList));
String verb = "/edit-media/study";
String alias = "/hdl:1902.29/11514";// hdl:1902.29/11514
if (local) {
alias = "/hdl:TEST/10000";// dtdv tdvn2 ddt hdl:TEST/10000
}
// dvntest: ddt
// hdl:1902.29/11514
// hdl:1902.29/11512
String verbAndAlias = verb + alias;
String portNumber = "443";// 8181 443
if (local) {
portNumber = "8181";
}
String protocol = "https";
String hostUrl = hostname + ":" + portNumber;
logger.log(Level.INFO, "hostUrl={0}", hostUrl);
String requestUrl = protocol + "://"
+ user + ":" + password + "@"
+ hostUrl
+ REQUEST_ROOT + verbAndAlias;
String zipFileName = "dvn-sample-files_5.zip";
String mimeTypeTokenZip = "application/zip";
CloseableHttpClient httpclient = null;
CloseableHttpResponse resp = null;
HttpEntity entity = null;
String failedStatus;
try {
CredentialsProvider credsProvider = new BasicCredentialsProvider();
credsProvider.setCredentials(new AuthScope(hostname,
Integer.parseInt(portNumber)),
new UsernamePasswordCredentials(
user, password));
httpclient = getCloseableHttpClient(credsProvider);
logger.log(Level.INFO, "zip-upload case: requestUrl={0}", requestUrl);
try {
HttpPost httppost = new HttpPost(requestUrl);
File zip = new File(zipFileName);
if (!zip.exists()) {
logger.log(Level.SEVERE, "zip file ({0}) was not found", zipFileName);
throw new FileNotFoundException();
} else {
logger.log(Level.INFO, "zip file ({0}) exists", zipFileName);
}
FileEntity reqEntity = new FileEntity(zip, ContentType.create(mimeTypeTokenZip));
httppost.setEntity(reqEntity);
logger.log(Level.INFO, "executing request={0}",
httppost.getRequestLine());
httppost.addHeader("Content-Type", "application/zip");
httppost.addHeader("Content-Disposition", "filename= " + zipFileName);
httppost.addHeader("Packaging", "http://purl.org/net/sword/package/SimpleZip");
resp = httpclient.execute(httppost);
int statusCode = resp.getStatusLine().getStatusCode();
logger.log(Level.INFO, "statusCode={0}", statusCode);
if (statusCode != HttpStatus.SC_CREATED) {
logger.log(Level.WARNING,
"response to http request is not OK: abort the request: status code={0}",
statusCode);
if (statusCode == HttpStatus.SC_UNAUTHORIZED) {
logger.log(Level.SEVERE, "This box ({0}) may not have created the user account", hostname);
failedStatus = "authentication failure";
} else {
failedStatus = "HttpStatusCode1=" + statusCode;
}
httppost.abort();
return;
}
entity = resp.getEntity();
String response = EntityUtils.toString(entity);
logger.log(Level.INFO, "response={0}", xstream.toXML(response));
logger.log(Level.INFO, "response={0}", response);
Builder parser = new Builder();
Document doc = parser.build(new StringReader(response));
Serializer serializer = new Serializer(System.out, "UTF-8");
serializer.setIndent(4);
serializer.write(doc);
serializer.flush();
logger.log(Level.INFO, "finishing the http/https request ");
} catch (ParsingException ex) {
logger.log(Level.SEVERE, "ParsingException", ex);
} finally {
if (resp != null) {
resp.close();
}
}
} catch (SSLPeerUnverifiedException ex) {
logger.log(Level.SEVERE, "SSLPeerUnverifiedException", ex);
ex.printStackTrace();
} catch (IOException ex) {
logger.log(Level.SEVERE, "IOException", ex);
ex.printStackTrace();
} finally {
if (httpclient != null) {
try {
httpclient.close();
} catch (IOException ex) {
logger.log(Level.SEVERE, null, ex);
}
}
}
}
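The request URL in the method above is assembled from the host, port, REQUEST_ROOT, the verb, and the study's global ID. A minimal sketch of that assembly is shown below. The value of `REQUEST_ROOT` is not shown in the sample, so the one used here is an assumption inferred from the curl example earlier; the class and method names are hypothetical. This variant also leaves the user name and password out of the URL, on the assumption that the CredentialsProvider already supplies them.

```java
// Hypothetical helper that builds the edit-media URL used for the zip upload.
class EditMediaUrl {

    // Assumed value, inferred from the curl example; the real constant is not shown.
    static final String REQUEST_ROOT = "/dvn/api/data-deposit/v1/swordv2";

    // Verb taken from the sample method above.
    static final String VERB = "/edit-media/study";

    // globalId is the study identifier, e.g. "hdl:TEST/10000".
    static String build(String host, int port, String globalId) {
        return "https://" + host + ":" + port + REQUEST_ROOT + VERB + "/" + globalId;
    }

    public static void main(String[] args) {
        // Local test instance, as in the sample method.
        System.out.println(build("localhost", 8181, "hdl:TEST/10000"));
    }
}
```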
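The following is a quick-fix solution. Note that the client it builds trusts every server certificate and skips hostname verification, so it should be used only for testing (for example, against a server with a self-signed certificate):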
CredentialsProvider credsProvider
= new BasicCredentialsProvider();
credsProvider.setCredentials(new AuthScope(dataverseAccount.getHost(),
dataverseAccount.getPort()),
new UsernamePasswordCredentials(
dataverseAccount.getUserName(), dataverseAccount.getPassword()));
httpclient = getCloseableHttpClient(credsProvider);

// Builds an HTTP client that accepts any certificate chain and any hostname.
static CloseableHttpClient getCloseableHttpClient(CredentialsProvider credsProvider) {
CloseableHttpClient httpclient = null;
try {
SSLContext sslcontext = SSLContexts.custom()
.loadTrustMaterial(
null, new TrustStrategy() {
public boolean isTrusted(X509Certificate[] chain,
String authType)
throws CertificateException {
return true;
}
}
)
.build();
SSLConnectionSocketFactory sslsf
= new SSLConnectionSocketFactory(sslcontext,
SSLConnectionSocketFactory.ALLOW_ALL_HOSTNAME_VERIFIER);
httpclient = HttpClients.custom()
.setSSLSocketFactory(sslsf)
.setUserAgent(USER_AGENT)
.setDefaultCredentialsProvider(credsProvider)
.build();
return httpclient;
} catch (KeyStoreException ex) {
logger.log(Level.SEVERE, "KeyStoreException", ex);
} catch (NoSuchAlgorithmException ex) {
logger.log(Level.SEVERE, "NoSuchAlgorithmException", ex);
} catch (KeyManagementException ex) {
logger.log(Level.SEVERE, "KeyManagementException", ex);
}
return httpclient;
}
create a service to be used in an indexer to move data from an iRODS grid to DVN