imagej / imagej-common

ImageJ core data model
https://imagej.net/libs/imagej-common
BSD 2-Clause "Simplified" License

Add IOPlugins for persisting Table objects #64

Closed: ctrueden closed this issue 6 years ago

ctrueden commented 8 years ago

After making a big Table object, it is nice to be able to save it, and read it back in later.

And it will make @stelfrich, @bnorthan and probably others happy.

stelfrich commented 8 years ago

Out of curiosity, I did some preliminary work on the table-io branch.

The implementation on this branch was really just a proof-of-principle of how awesome the SciJava stack really is :smile:. We will have to generalize that stuff to also support other formats, e.g. xlsx et al.

However, SCIFIO still throws a FormatException:

io.scif.FormatException: /home/stefan/Desktop/testTableIO.csv: No supported format found.
    at io.scif.services.DefaultFormatService.getFormatList(DefaultFormatService.java:351)
    at io.scif.services.DefaultFormatService.getFormat(DefaultFormatService.java:317)
    at io.scif.services.DefaultDatasetIOService.canOpen(DefaultDatasetIOService.java:81)
    at io.scif.io.DatasetIOPlugin.supportsOpen(DatasetIOPlugin.java:65)
    at org.scijava.io.DefaultIOService.getOpener(DefaultIOService.java:66)
    at org.scijava.io.DefaultIOService.open(DefaultIOService.java:85)
    at org.scijava.io.IOService$open$0.call(Unknown Source)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
    at Script1.run(Script1.groovy:19)
    at org.scijava.plugins.scripting.groovy.GroovyScriptEngine.eval(GroovyScriptEngine.java:303)
    at org.scijava.plugins.scripting.groovy.GroovyScriptEngine.eval(GroovyScriptEngine.java:122)
    at org.scijava.plugins.scripting.groovy.GroovyScriptEngine.eval(GroovyScriptEngine.java:114)
    at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:249)
    at org.scijava.script.ScriptModule.run(ScriptModule.java:174)
    at org.scijava.module.ModuleRunner.run(ModuleRunner.java:167)
    at org.scijava.module.ModuleRunner.call(ModuleRunner.java:126)
    at org.scijava.module.ModuleRunner.call(ModuleRunner.java:65)
    at org.scijava.thread.DefaultThreadService$2.call(DefaultThreadService.java:191)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

ctrueden commented 7 years ago

@lnyng did some work on this; as of this writing, see this code.

lnyng commented 7 years ago

There are also tests and a simple benchmark.

hadim commented 7 years ago

Seeing the code to save/read a Table as CSV in imagej-server, I am wondering whether there are plans to integrate it into imagej-common so that it can be used out of the box in Fiji/SJK.

stelfrich commented 7 years ago

@ctrueden I have (rather naively) moved @lnyng's implementation to table-io. The implementation uses SCIFIO's LocationService, which introduces a circular dependency.

Is there a reasonable way to replace the following snippet from DefaultTableIOPlugin.java#L183-L190:

final IRandomAccess handle = locationService.getHandle(source);
if (handle instanceof VirtualHandle) {
    throw new IOException("Cannot open source");
}
handle.seek(0);
final byte[] buffer = new byte[(int) handle.length()];
handle.read(buffer);
final String text = new String(buffer);

ctrueden commented 7 years ago

Is there a reasonable way to replace the following snippet

@stelfrich Yes! We moved all the data handle stuff into SciJava Common. The code you posted is the old API using SCIFIO. With the latest release of SJC, try something like this:

final Location source = new FileLocation(path);
//OR:
//final Location source = new HTTPLocation(uri);
//final Location source = new BytesLocation(byteArray);
final DataHandle handle = dataHandleService.create(source);
final String text = handle.readString(Integer.MAX_VALUE);

The only thing missing at the moment is a way to automatically convert from String to Location. @gab1one and I will probably add converters for this, but they do not exist yet. There are some edge cases: e.g., suppose you have a file in your CWD called google.com. Should we convert the string google.com to an HTTPLocation or to a FileLocation?

Edit: Note also that in the future IOPlugin will take a Location instead of a String for the source, so the "decide which kind of location it is" problem will be moot.
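
To make the google.com edge case concrete, here is a minimal sketch of one possible rule, assuming that only strings with an explicit http or https scheme count as remote and that everything else is treated as a file path. The names LocationGuess, Kind and guess are hypothetical and exist only for this illustration; they are not part of SciJava Common, and the sketch relies on nothing beyond java.net.URI.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class LocationGuess {

    /** Hypothetical categories for illustration; not a SciJava Common type. */
    enum Kind { FILE, HTTP }

    /**
     * Guesses which kind of location a string refers to.
     * Assumed rule: an explicit http(s) scheme means HTTP; anything else,
     * including a bare name such as "google.com", is taken to be a file path.
     */
    static Kind guess(final String source) {
        try {
            final String scheme = new URI(source).getScheme();
            if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
                return Kind.HTTP;
            }
        } catch (final URISyntaxException e) {
            // Not parseable as a URI (e.g. a Windows path with backslashes):
            // fall through and treat it as a file path.
        }
        return Kind.FILE;
    }

    public static void main(final String[] args) {
        System.out.println(guess("google.com"));                   // FILE
        System.out.println(guess("https://google.com/table.csv")); // HTTP
    }
}
```

Under this rule a file named google.com in the CWD always wins over the website, which is predictable but forces callers to spell out the scheme for remote sources. Other rules are equally defensible, which is why taking a Location directly sidesteps the question.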

stelfrich commented 7 years ago

The only thing missing at the moment is a way to automatically convert from String to Location.

I assume a FileLocation for now and have added FIXMEs in the code on table-io.

stelfrich commented 7 years ago

I have migrated @lnyng's benchmark to JMH and moved it into its own repository: https://github.com/stelfrich/imagej-common-benchmarks

I'll add my implementation using OpenCSV to the benchmark shortly.

stelfrich commented 7 years ago

I'll add my implementation using OpenCSV to the benchmark shortly.

It's actually using Commons CSV, not OpenCSV:

Benchmark                                                           Mode  Cnt    Score    Error  Units
DefaultTableIOPluginBenchmark.openLargeWithCommonsCSVTableIOPlugin  avgt   10   88.226 ±  0.832  ms/op
DefaultTableIOPluginBenchmark.openLargeWithDefaultTableIOPlugin     avgt   10  100.083 ±  3.423  ms/op
TableIOPluginsSaveBenchmark.writeLargeWithCommonsCSVTableIOPlugin   avgt   10  311.826 ± 12.789  ms/op
TableIOPluginsSaveBenchmark.writeLargeWithDefaultTableIOPlugin      avgt   10  161.235 ±  4.124  ms/op
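
For a quick read of these scores, the ratios between the two plugins can be computed directly from the table. The sketch below is plain Java with the four scores copied from the output above; the class name BenchmarkRatios and the method ratio are made up for this example, and the error margins are ignored.

```java
import java.util.Locale;

public class BenchmarkRatios {

    /** Returns how many times larger the first score is than the second. */
    static double ratio(final double numerator, final double denominator) {
        return numerator / denominator;
    }

    public static void main(final String[] args) {
        // Scores in ms/op, taken from the JMH output above.
        final double openCommons = 88.226, openDefault = 100.083;
        final double writeCommons = 311.826, writeDefault = 161.235;

        // Locale.ROOT keeps the decimal separator a dot on every machine.
        System.out.println(String.format(Locale.ROOT,
            "open:  default / Commons CSV = %.2f", ratio(openDefault, openCommons)));
        System.out.println(String.format(Locale.ROOT,
            "write: Commons CSV / default = %.2f", ratio(writeCommons, writeDefault)));
    }
}
```

On this run, the default plugin takes about 1.13 times as long as Commons CSV to open the large table, while Commons CSV takes about 1.93 times as long as the default plugin to write it.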

stelfrich commented 6 years ago

This issue has been resolved by imagej/imagej-plugins-io-table and its successor scijava/scijava-plugins-io-table.