Closed vitalielias closed 1 year ago
I think you need to provide a default value for "output_zip_path" in the process_zip function, just as you did with resultsPath = 'defResults.json' in the workFlow function, which you previously used to generate the output JSON file.
The first issue is a Python path error: the path needs to be configured in src/main/resources/application.properties each time a new instance of the mapping service is set up. Thanks Nicolas for pointing that out!
The second issue appears to be a problem with the file extension guesser in src/main/java/edu/kit/datamanager/mappingservice/util/FileUtil.java:
private static String guessFileExtension(byte[] schema) {
    // Cut schema to a maximum of MAX_LENGTH_OF_HEADER characters.
    int length = Math.min(schema.length, MAX_LENGTH_OF_HEADER);
    String schemaAsString = new String(schema, 0, length);
    LOGGER.trace("Guess type for '{}'", schemaAsString);
    Matcher m = JSON_FIRST_BYTE.matcher(schemaAsString);
    if (m.matches()) {
        return ".json";
    } else {
        m = XML_FIRST_BYTE.matcher(schemaAsString);
        if (m.matches()) {
            return ".xml";
        }
    }
    return null;
}
When a results file is created, it is given an arbitrary, temporary .result extension. This function then renames it by guessing the file extension, but it guesses wrong and applies a .json extension to the results file. It needs to be fixed either by forcing zip output or by improving the guessing.
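One way the guessing could be improved, sketched under assumptions: check the raw bytes for the ZIP signature before applying the text-based patterns. The class name ExtensionGuesser, the two regex patterns, and the 100-byte header limit are illustrative assumptions; the actual JSON_FIRST_BYTE, XML_FIRST_BYTE and MAX_LENGTH_OF_HEADER definitions in FileUtil.java are not shown in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

final class ExtensionGuesser {
    // Assumed patterns for illustration only; not the ones used in FileUtil.java.
    private static final Pattern JSON_START = Pattern.compile("^\\s*[\\{\\[].*", Pattern.DOTALL);
    private static final Pattern XML_START = Pattern.compile("^\\s*<.*", Pattern.DOTALL);
    // Assumed header limit for illustration.
    private static final int MAX_HEADER = 100;

    /** Returns ".zip", ".json", ".xml", or null if nothing matches. */
    static String guessFileExtension(byte[] content) {
        if (content == null) {
            return null;
        }
        // A non-empty ZIP archive starts with the local file header
        // signature 'P' 'K' 0x03 0x04. Check raw bytes before decoding text.
        // (An empty archive starts with 'P' 'K' 0x05 0x06 and is not handled here.)
        if (content.length >= 4
                && content[0] == 0x50 && content[1] == 0x4B
                && content[2] == 0x03 && content[3] == 0x04) {
            return ".zip";
        }
        int length = Math.min(content.length, MAX_HEADER);
        String head = new String(content, 0, length, StandardCharsets.UTF_8);
        if (JSON_START.matcher(head).matches()) {
            return ".json";
        }
        if (XML_START.matcher(head).matches()) {
            return ".xml";
        }
        return null;
    }
}
```

Checking the binary signature first means the ZIP check does not depend on how the bytes happen to decode as text.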
A fix may be to remove this temporary renaming of the file extensions for result files, which is in the same file:
public static Path createTempFile(String prefix, String suffix) {
    Path tempFile;
    prefix = (prefix == null || prefix.trim().isEmpty()) ? DEFAULT_PREFIX : prefix;
    suffix = (suffix == null || suffix.trim().isEmpty() || suffix.trim().equals(".")) ? DEFAULT_SUFFIX : suffix;
    try {
        tempFile = Files.createTempFile(prefix, suffix);
    } catch (IOException ioe) {
        throw new MappingException("Error creating tmp file!", ioe);
    }
    return tempFile;
}
The problem arises when the temporary result file path is created in the impl/MappingService.java class.
The mapping service needs to know where the result file of a plugin is stored. Otherwise it wouldn't be able to return it to the user. For this reason a filename (ending in .result) is generated, stored and given to the plugin, which uses it to store the result. After the execution of the plugin the .result file is sent to the user. Since the user wants a correct file extension, the guessFileExtension() method is called. This method checks the first bytes of the file for a specific (hardcoded) signature and modifies the extension to either .json or .xml. This doesn't work for .zip files, which is why a file with an incorrect file extension is returned.
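The renaming step described above can be sketched roughly as follows. This is a minimal illustration, not the code in MappingService; the class name ResultFileRenamer and the method replaceExtension are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class ResultFileRenamer {
    /**
     * Moves e.g. "abc123.result" to "abc123.zip" in the same directory.
     * newExtension is expected to include the leading dot.
     */
    static Path replaceExtension(Path resultFile, String newExtension) throws IOException {
        String name = resultFile.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // Keep the whole name if there is no extension to strip.
        String base = (dot > 0) ? name.substring(0, dot) : name;
        Path target = resultFile.resolveSibling(base + newExtension);
        return Files.move(resultFile, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Whatever supplies newExtension (a guesser or the plugin itself) decides which extension the user sees; the move is the same either way.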
A solution for this problem would be to modify the IMappingPlugin interface to add a result file extension string. This can be used in the MappingService class to fix the file extension. A disadvantage of this solution is that all existing plugins would have to be modified to comply with the changed IMappingPlugin and therefore be usable with the fixed mapping service.
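A rough sketch of what such an interface change might look like. MappingPluginSketch is a hypothetical stand-in for IMappingPlugin (whose real methods are not shown in this thread), and the method name resultFileExtension() is an assumption. Using a Java default method that returns null is one possible way to let unmodified plugins keep compiling and fall back to the guessed extension; whether that fits the actual plugin loading mechanism is untested.

```java
// Hypothetical stand-in for IMappingPlugin; the real interface differs.
interface MappingPluginSketch {
    String name();

    /**
     * Extension of the file the plugin writes, including the leading dot,
     * or null if the plugin does not declare one.
     */
    default String resultFileExtension() {
        return null;
    }
}

// A plugin that declares its output type explicitly.
final class ZipBatchPlugin implements MappingPluginSketch {
    public String name() { return "zip-batch"; }
    @Override public String resultFileExtension() { return ".zip"; }
}

// An older plugin that was not updated; it inherits the default.
final class LegacyPlugin implements MappingPluginSketch {
    public String name() { return "legacy"; }
}

final class ResultNaming {
    private static final String TEMP_SUFFIX = ".result";

    /** Prefers the plugin's declared extension, else the guessed one. */
    static String finalName(String tempName, MappingPluginSketch plugin, String guessedExtension) {
        String base = tempName.endsWith(TEMP_SUFFIX)
                ? tempName.substring(0, tempName.length() - TEMP_SUFFIX.length())
                : tempName;
        String declared = plugin.resultFileExtension();
        return base + (declared != null ? declared : guessedExtension);
    }
}
```

For example, ResultNaming.finalName("out123.result", new ZipBatchPlugin(), ".json") yields "out123.zip", while the same call with LegacyPlugin yields "out123.json".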
A fix has been tentatively implemented by @VolkerHartmann (see 21cf3de). I have yet to test it with the current working plugins, and this will be the task tomorrow so this issue can hopefully be closed.
After testing, the mapping service still insists on producing a JSON file as output, even though the underlying Python script run by the plugin produces a .zip file. This is now a priority, as all other work depends on it.
This should be solved in the meantime. I'll close the issue for now. If the error still occurs, please re-open the issue or create a new one.
There appears to be an issue when the Python code is executed with the new plugin for batch processing.
dateutil package not found?
The first issue is the ModuleNotFoundError in the Python code itself. I expected the dateutil package to ship with the default Python installation, and this has not been an issue until this new plugin. It's possible that a simple pip upgrade will fix it, but I'm not sure which Python environment the mapping service is running in. I tried to upgrade my local environment but this did not work.
no resultPath?
The second issue may be related to the first. The service seems either to be passing a variable called resultPath to the execution of the Python code, or to be looking for somewhere to store the resulting file. I'm not sure which, because I searched through all lines of the plugin code and could not find a variable called resultPath (I have one in my Python code called resultsPath, but that is a local variable within a function and, I believe, unrelated).
The mapping service is still trying to output a JSON file after processing the zip file. When I test my Python code on its own by giving it a zip file, it returns the correct output: a zip file containing the JSON documents of all TIFFs in the input zip file. However, when integrated in the mapping service, the service is still trying to output a JSON file.
Mapping Service log when trying to upload a zip file