Closed kazurayam closed 10 months ago
I should try mounting S3 as a local file system. I should check if materialstore works with it.
I think that the materialstore library does not need to worry about doing I/O against AWS S3 directly.
We can map an S3 bucket to a local directory using tools like Cyberduck.
You can try using Amazon AWS S3 FileSystem Provider JSR-203 for Java 7 (NIO2) also known as S3FS.
https://stackoverflow.com/questions/41113119/java-nio-file-implementation-for-aws
I found that com.kazurayam.materialstore.core.filesystem.StoreOnS3Test runs slowly.
<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="com.kazurayam.materialstore.core.filesystem.StoreOnS3Test" tests="3" skipped="0" failures="0" errors="0" timestamp="2023-01-08T11:44:35" hostname="KAZUAKInoMacBook-Air-2.local"
time="27.354">
<properties/>
<testcase name="testCreateStore()" classname="com.kazurayam.materialstore.core.filesystem.StoreOnS3Test"
time="15.52"/>
<testcase name="testS3fs()" classname="com.kazurayam.materialstore.core.filesystem.StoreOnS3Test"
time="11.833"/>
<testcase name="testNewInstanceOnAwsS3()" classname="com.kazurayam.materialstore.core.filesystem.StoreOnS3Test" time="0.001"/>
<system-out><![CDATA[]]></system-out>
<system-err><![CDATA[[Test worker] INFO com.kazurayam.materialstore.core.filesystem.Material - root.getClass().toString()=class com.upplication.s3fs.S3Path
]]></system-err>
</testsuite>
I should study the speed bottleneck in more detail.
I made the com.kazurayam.materialstore.core.filesystem.StoreOnS3Test class measure performance using the Timekeeper library.
It took 28 seconds to execute the testS3fs() method.
Timekeeper emitted the following measurement result.
Step | duration | graph |
---|---|---|
creating new FileSystem on S3 | 00:02 | # |
creating parent dir | 00:00 | # |
writing a file | 00:01 | # |
listing a dir | 00:00 | # |
deleting a file | 00:01 | # |
deleting a dir | 00:02 | # |
closing the FileSystem | 00:00 | # |
Average | 00:01 |
The steps inside the test method took 6+α seconds in total, whereas the JUnit runner reported that it took 28 seconds.
It seems that JUnit spent around 20 seconds carrying out the test. The overhead added by JUnit seems to be heavy; heavier than I guessed.
This implies that the performance of s3fs is good enough; it is not as slow as I was worried about.
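To separate the time spent in file operations from the time reported by the test runner, each step can be timed on its own. The following is a minimal sketch that uses plain System.nanoTime() rather than the Timekeeper library. StepTimer, IoStep and measure are hypothetical names, not part of materialstore, and the steps mirror the table above except for creating and closing the FileSystem.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of materialstore or Timekeeper.
final class StepTimer {

    /** A step that may throw IOException, so file operations can be timed directly. */
    interface IoStep {
        void run() throws IOException;
    }

    // LinkedHashMap keeps insertion order, so steps are reported in execution order.
    private final Map<String, Long> nanosByStep = new LinkedHashMap<>();

    /** Runs one step and records how long it took, in nanoseconds. */
    void time(String label, IoStep step) throws IOException {
        long start = System.nanoTime();
        step.run();
        nanosByStep.put(label, System.nanoTime() - start);
    }

    Map<String, Long> results() {
        return nanosByStep;
    }

    long totalNanos() {
        long sum = 0L;
        for (long nanos : nanosByStep.values()) {
            sum += nanos;
        }
        return sum;
    }

    /** Renders the measurements as a small pipe-separated table, in milliseconds. */
    String report() {
        StringBuilder sb = new StringBuilder("Step | duration (ms)\n---|---\n");
        nanosByStep.forEach((label, nanos) ->
                sb.append(label).append(" | ").append(nanos / 1_000_000).append('\n'));
        sb.append("Total | ").append(totalNanos() / 1_000_000).append('\n');
        return sb.toString();
    }

    /** Times a create/write/list/delete sequence on whatever FileSystem owns `base`. */
    static StepTimer measure(Path base) throws IOException {
        StepTimer timer = new StepTimer();
        Path dir = base.resolve("sample");
        Path file = dir.resolve("hello.txt");
        timer.time("creating parent dir", () -> Files.createDirectories(dir));
        timer.time("writing a file",
                () -> Files.write(file, "hello".getBytes(StandardCharsets.UTF_8)));
        timer.time("listing a dir", () -> {
            // Files.list returns a lazy stream that must be closed after use.
            try (Stream<Path> children = Files.list(dir)) {
                children.count();
            }
        });
        timer.time("deleting a file", () -> Files.delete(file));
        timer.time("deleting a dir", () -> Files.delete(dir));
        return timer;
    }
}
```

Calling StepTimer.measure(Files.createTempDirectory("timing")).report() exercises the default FileSystem. Passing a Path obtained from an S3-backed FileSystem should exercise the same steps against S3, assuming the provider supports these operations.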
supported at version 0.15.0
This issue has not been finished yet.
I reviewed this issue because issue #437 required me to do so.
As of the latest v0.16.6, the FileSystemFactory.newFileSystem() method returns an instance of java.nio.file.FileSystem. The returned value could either be the default FileSystem, or the S3FS. You can control which type of FileSystem to obtain.
To obtain the S3FS, specify "s3fs.uri" with a value that is a URI string representing one of the published S3 endpoints. For example, "s3://s3.ap-northeast-1.AMAZONAWS.COM".
However, some classes still use the java.nio.file.Paths class and do not use com.kazurayam.materialstore.core.FileSystemFactory yet. For example, look at com.kazurayam.materialstore.base.report.IndexCreator. It has the following fragment:
public String makeTitle(Map<String, Object> model) {
    String s = (String)model.get("store");
    Path parent = Paths.get(s);
    return parent.getFileName().toString() + "/index.html";
}
This section must be changed as follows so that it uses the FileSystem instantiated by the FileSystemFactory:
public String makeTitle(Map<String, Object> model) {
    String s = (String)model.get("store");
    // java.nio.file.FileSystem resolves a path string with getPath(String, String...)
    Path parent = FileSystemFactory.newFileSystem().getPath(s);
    return parent.getFileName().toString() + "/index.html";
}
In other words, I should no longer use the java.nio.file.Paths.get(String) method. Instead, I should use FileSystemFactory.newFileSystem().getPath(s). There are many places in the project where I call Paths.get(String), and I need to modify all of them.
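The idea of obtaining the FileSystem from one place and resolving every path through it can be sketched as follows. This is not the actual FileSystemFactory of materialstore. FileSystemSource and TitleMaker are hypothetical names, and reading "s3fs.uri" from a JVM system property is an assumption made only for this illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.util.Collections;

// Hypothetical stand-in; the real FileSystemFactory may be implemented differently.
final class FileSystemSource {

    // Assumption: the S3 endpoint URI is supplied as a system property with this name.
    static final String URI_KEY = "s3fs.uri";

    /** Returns the default FileSystem unless a URI is configured under URI_KEY. */
    static FileSystem current() throws IOException {
        String uri = System.getProperty(URI_KEY);
        if (uri == null || uri.isEmpty()) {
            return FileSystems.getDefault();
        }
        // Requires a FileSystemProvider for the URI scheme on the runtime classpath;
        // otherwise this call throws ProviderNotFoundException. It may also throw
        // FileSystemAlreadyExistsException if the file system was created before.
        return FileSystems.newFileSystem(URI.create(uri),
                Collections.<String, Object>emptyMap());
    }
}

// Hypothetical variant of makeTitle that takes the FileSystem as a parameter.
final class TitleMaker {

    static String makeTitle(FileSystem fs, String store) {
        // Resolve the string against the given FileSystem instead of Paths.get(String),
        // which always resolves against the default FileSystem.
        Path parent = fs.getPath(store);
        return parent.getFileName().toString() + "/index.html";
    }
}
```

Taking the FileSystem as a parameter keeps the method independent of where the store lives, so the same code can be exercised in a unit test with FileSystems.getDefault().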
implementation libs.s3fs
I can change this to:
testImplementation libs.s3fs
This change will not break the existing code at all, because the materialstore library only uses the Service Provider Interface of FileSystem.
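The reason no compile-time dependency is needed is that java.nio.file looks up FileSystemProvider implementations at runtime and selects one by URI scheme. A minimal sketch that lists the schemes visible to the running JVM is shown below. InstalledSchemes is a hypothetical name used only for this illustration.

```java
import java.nio.file.spi.FileSystemProvider;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of materialstore.
final class InstalledSchemes {

    /** Returns the URI schemes of all FileSystemProviders installed in this JVM. */
    static List<String> list() {
        List<String> schemes = new ArrayList<>();
        // installedProviders() includes the default provider ("file") plus any
        // providers registered through the service-provider mechanism.
        for (FileSystemProvider provider : FileSystemProvider.installedProviders()) {
            schemes.add(provider.getScheme());
        }
        return schemes;
    }
}
```

The returned list always contains "file". With the s3fs jar on the runtime classpath I would expect it to contain the scheme of the S3 provider as well. Note that a dependency declared as testImplementation is not passed on to projects that depend on materialstore, so an application that wants the S3 provider would have to add the s3fs jar to its own runtime classpath.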
I have changed it to testImplementation libs.s3fs
Let me imagine that I want to develop an application on top of the "materialstore" on AWS. I would find it necessary to create a "store" on AWS S3.
How can I make it possible?
I want to develop an application runnable on EC2 with backing S3.