ologolo / streamline-api

Provides an API for managing file conversions
GNU Lesser General Public License v2.1

Use InputStream/OutputStream in InternalTask #21

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Modify the internal task to use InputStream/OutputStream instead of File. This
would reduce I/O.

Original issue reported on code.google.com by joel.hak...@mtm.se on 3 Oct 2012 at 8:59

GoogleCodeExporter commented 9 years ago

Original comment by joel.hak...@mtm.se on 3 Oct 2012 at 12:21

GoogleCodeExporter commented 9 years ago

Original comment by joel.hak...@mtm.se on 24 Oct 2012 at 10:26

GoogleCodeExporter commented 9 years ago
While this would reduce IO, it would trigger other issues such as:
 * XSLT processing (relative references, system id, etc.)
 * use of file name in processing

Therefore, this issue is unlikely to ever be resolved. It remains open in case
design changes remove the obstacles listed above.
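
To illustrate the system id concern: with the standard JAXP API, a StreamSource built from a File carries a system id that a processor can resolve relative references against, while one built from a bare InputStream has none unless it is passed explicitly. A minimal sketch follows. The class and method names (SystemIdDemo, fromFile, and so on) are made up for illustration and are not part of this project.

```java
import java.io.File;
import java.io.InputStream;
import javax.xml.transform.stream.StreamSource;

final class SystemIdDemo {
    // File-based source: the constructor derives the system id from the file.
    static StreamSource fromFile(File file) {
        return new StreamSource(file);
    }

    // Stream-only source: no system id, so relative references have no base.
    static StreamSource fromStreamOnly(InputStream in) {
        return new StreamSource(in);
    }

    // Stream source with the base location supplied explicitly.
    // Assumption: the caller still knows where the data originally came from.
    static StreamSource fromStreamWithBase(InputStream in, File originalLocation) {
        return new StreamSource(in, originalLocation.toURI().toASCIIString());
    }
}
```

This only shows where the base location is lost. Whether a given step actually needs it depends on the stylesheet or validator in use.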

Original comment by joel.hak...@mtm.se on 31 Jul 2013 at 6:15

GoogleCodeExporter commented 9 years ago

Original comment by joel.hak...@mtm.se on 24 Sep 2013 at 7:18

GoogleCodeExporter commented 9 years ago
The issues mentioned above do not apply in practice. Steps currently have no
access to relative references, and they cannot use the file name in processing
because their inputs are temp files with non-descriptive names.

If needed, this information could be made available, either through system-wide
properties or by explicitly passing it to the applicable steps.

There is an issue with spawning multiple streams inside a step. This could be 
solved by using a custom interface in the internal tasks that would contain the 
following method:
public InputStream newInputStream();

Two "juggler" implementations could be created, both implementing a new
interface for "stepping": one using file streams and one using memory streams.
This would allow selecting the most appropriate option for the environment.
The stepping interface would contain:
public void reset();
public SpawnableInputStream getInput();
public OutputStream getOutput();
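
A rough sketch of how the two interfaces and the dual implementations could fit together. Everything beyond the four method signatures above is an assumption: the names StreamJuggler, MemoryStreamJuggler, FileStreamJuggler and JugglerDemo are invented, the "throws IOException" clauses are added, and reset() is assumed to mean "the output of the finished step becomes the input of the next step". It uses InputStream.readAllBytes, so it needs Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hands out any number of independent streams over the same data.
interface SpawnableInputStream {
    InputStream newInputStream() throws IOException; // "throws" is an assumption
}

// Hypothetical "stepping" interface; the name StreamJuggler is invented.
interface StreamJuggler {
    SpawnableInputStream getInput();
    OutputStream getOutput() throws IOException;
    void reset() throws IOException; // assumed: last output becomes the next input
}

// Keeps intermediate results in memory.
final class MemoryStreamJuggler implements StreamJuggler {
    private byte[] input;
    private ByteArrayOutputStream output = new ByteArrayOutputStream();

    MemoryStreamJuggler(byte[] initialInput) { input = initialInput.clone(); }

    public SpawnableInputStream getInput() {
        final byte[] current = input;
        return () -> new ByteArrayInputStream(current);
    }
    public OutputStream getOutput() { return output; }
    public void reset() {
        input = output.toByteArray();
        output = new ByteArrayOutputStream();
    }
}

// Keeps intermediate results in two temp files that swap roles on reset().
final class FileStreamJuggler implements StreamJuggler {
    private Path input;
    private Path output;

    FileStreamJuggler(byte[] initialInput) throws IOException {
        input = Files.createTempFile("juggler", ".tmp");
        output = Files.createTempFile("juggler", ".tmp");
        input.toFile().deleteOnExit();
        output.toFile().deleteOnExit();
        Files.write(input, initialInput);
    }
    public SpawnableInputStream getInput() {
        final Path current = input;
        return () -> Files.newInputStream(current);
    }
    // Opening the output stream truncates whatever the file held before.
    public OutputStream getOutput() throws IOException { return Files.newOutputStream(output); }
    public void reset() { Path old = input; input = output; output = old; }
}

final class JugglerDemo {
    // Runs two steps against any juggler and returns the final result as text.
    static String runSteps(StreamJuggler juggler) throws IOException {
        // Step 1: reverse the text.
        try (InputStream in = juggler.getInput().newInputStream();
             OutputStream out = juggler.getOutput()) {
            String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            out.write(new StringBuilder(text).reverse().toString().getBytes(StandardCharsets.UTF_8));
        }
        juggler.reset();
        // Step 2: spawn two streams over the same input (count, then copy).
        SpawnableInputStream input = juggler.getInput();
        int length;
        try (InputStream firstPass = input.newInputStream()) {
            length = firstPass.readAllBytes().length;
        }
        try (InputStream secondPass = input.newInputStream();
             OutputStream out = juggler.getOutput()) {
            out.write((length + ":").getBytes(StandardCharsets.UTF_8));
            out.write(secondPass.readAllBytes());
        }
        juggler.reset();
        try (InputStream result = juggler.getInput().newInputStream()) {
            return new String(result.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Picks the memory-backed or file-backed juggler, as the environment would.
    static String run(String text, boolean inMemory) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try {
            StreamJuggler juggler = inMemory ? new MemoryStreamJuggler(bytes) : new FileStreamJuggler(bytes);
            return runSteps(juggler);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, the steps themselves never see whether they run against memory or disk. Only the code that constructs the juggler decides.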

Original comment by joel.hak...@mtm.se on 11 Feb 2014 at 3:49

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r805.

Original comment by joel.hak...@mtm.se on 12 Feb 2014 at 10:43

GoogleCodeExporter commented 9 years ago
Reverted for performance reasons

Original comment by joel.hak...@mtm.se on 13 Feb 2014 at 7:58

GoogleCodeExporter commented 9 years ago
For unknown reasons, using streams in tasks caused a significant slowdown
(about 20% longer execution time), even when the streams were backed by files.
That setup should be equivalent to the old version, since many steps wrap
their files in streams anyway. One theory is that the underlying
implementations of XSLT processing and validation are optimized for direct
file access in some way.
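
One way to probe that theory would be to feed the same document to a JAXP transformer once as a File and once as an InputStream and compare the timings. The sketch below is only an illustration of the idea, not the measurement behind the 20% figure. SourceTimingSketch and its methods are invented names, it uses an identity transform instead of a real stylesheet, and it has no warm-up, so the numbers it produces are only a rough indication.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class SourceTimingSketch {
    // Identity transform: copies the source to a string without a stylesheet.
    static String transform(Source source) throws TransformerException {
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer().transform(source, new StreamResult(out));
        return out.toString();
    }

    // Returns {nanos via File, nanos via InputStream}, summed over all rounds.
    static long[] compare(String xml, int rounds) {
        try {
            File input = File.createTempFile("timing", ".xml");
            input.deleteOnExit();
            Files.write(input.toPath(), xml.getBytes(StandardCharsets.UTF_8));
            long viaFile = 0;
            long viaStream = 0;
            for (int i = 0; i < rounds; i++) {
                // Variant 1: let the processor open the file itself.
                long start = System.nanoTime();
                String fromFile = transform(new StreamSource(input));
                viaFile += System.nanoTime() - start;

                // Variant 2: hand the processor an already opened stream.
                start = System.nanoTime();
                String fromStream;
                try (InputStream in = new BufferedInputStream(Files.newInputStream(input.toPath()))) {
                    fromStream = transform(new StreamSource(in, input.toURI().toASCIIString()));
                }
                viaStream += System.nanoTime() - start;

                // Both variants must produce the same result to be comparable.
                if (!fromFile.equals(fromStream)) {
                    throw new IllegalStateException("outputs differ");
                }
            }
            return new long[] {viaFile, viaStream};
        } catch (IOException | TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For a trustworthy comparison, the real stylesheets and validators would have to be used on realistic input sizes, preferably under a benchmarking harness such as JMH.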

Original comment by joel.hak...@mtm.se on 14 Feb 2014 at 7:58