Closed Timshel closed 3 years ago
Thank you for the valuable feedback! The new examples also helped me catch the bug you'd mentioned in the previous issue (about empty time). I reproduced all the issues, so let me go through them one by one (the master branch at commit a3b35da08 includes all the fixes):
listing with s3cmd the bucket is not visible and if I try to list it will fail with a 404 on ais://aistore-test.
This behavior is expected: our S3-compatibility layer provides access only to AIS buckets, which may (or may not) be configured with a Cloud backend.
Empty LastModified.
AIStore currently does not track object creation times. We track (or rather, cache) access times, and only for the purpose of optimizing storage capacity when running out of space. In object-list responses, AIS returns the last access time. That is why never-accessed objects have an empty LastModified time. To support clients like s3cmd, AIS will now return a zero Unix time, indicating that the object exists in the Cloud but AIS has never downloaded it (please see the example in the updated documentation).
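For client-side scripts that consume the listing, here is a minimal sketch of how the zero Unix time could be interpreted. It assumes the value arrives as an ISO 8601 string such as 1970-01-01T00:00:00Z; the function name and the returned labels are hypothetical and not part of AIS or s3cmd.

```python
from datetime import timezone, datetime


def classify_last_modified(last_modified: str) -> str:
    """Classify a LastModified string from an object listing (hypothetical helper)."""
    if not last_modified:
        # Empty value: the behavior before the fix, nothing can be inferred.
        return "unknown"
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    ts = datetime.fromisoformat(last_modified.replace("Z", "+00:00"))
    if ts.tzinfo is None:
        # Assumption: a naive timestamp is in UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    if ts.timestamp() == 0:
        # Zero Unix time: the object exists in the Cloud but was never downloaded.
        return "remote-only"
    return "accessed"
```

A script could use this to skip or prefetch entries classified as "remote-only" instead of treating the timestamp as a real modification time.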
page size exceeds the maximum value (got: 10000, max expected: 1000)
Good catch, thank you! The bug, already fixed in master, was related to an AIS bucket being configured with a backend Cloud bucket.
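To illustrate the kind of mismatch behind this error, the sketch below caps a requested page size at the backend's maximum. It is not the actual fix in AIS; the constant is taken from the error message above, and the function name and the "non-positive means backend default" rule are assumptions made for the example.

```python
# Maximum page size reported in the error message ("max expected: 1000").
BACKEND_MAX_PAGE_SIZE = 1000


def effective_page_size(requested: int, backend_max: int = BACKEND_MAX_PAGE_SIZE) -> int:
    """Return a page size the backend will accept (hypothetical helper)."""
    if requested <= 0:
        # Assumption: a non-positive request means "use the backend maximum".
        return backend_max
    # Never ask the backend for more than it supports.
    return min(requested, backend_max)
```

With this rule, a request for 10000 entries against a backend limited to 1000 is served in pages of 1000 rather than rejected.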
When trying to push a file using s3cmd it failed after multiple retries due to invalid checksum.
Here again, an AIS bucket with a Cloud backend took a different path, returning the MD5 checksum in the response body and leaving the response header empty. This is fixed in master and must work fine once ais set props ais://test checksum.type=md5 has been applied to the bucket.
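For context, here is a rough sketch of the check an S3 client typically performs after an upload, which is why an empty header leads to retries. It assumes a single-part upload where the ETag header carries the hex MD5 of the body (multipart ETags are computed differently); the function name is hypothetical and this is not s3cmd's actual code.

```python
import hashlib


def etag_matches(body: bytes, etag_header: str) -> bool:
    """Compare the MD5 of the uploaded body with the ETag header (hypothetical helper)."""
    if not etag_header:
        # An empty or missing header cannot be verified, so the upload looks invalid.
        return False
    # ETag values are usually wrapped in double quotes.
    expected = etag_header.strip().strip('"').lower()
    return hashlib.md5(body).hexdigest() == expected
```

An empty header makes the comparison fail even though the object was stored, which matches the "file is still uploaded" symptom reported in the issue.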
Hi,
Even if for now I'll pause my exploration of aistore, I wanted to give some feedback on the issues I encountered when testing (using https://github.com/NVIDIA/aistore/commit/9ec804e2a15992d2813ca07268d48fb88cca44d0).
After activating the aws cloud backend, I started with direct access. I was able to list the bucket content (after getting trolled a bit, since I was too restrictive on my bucket policy and needed to add more than just List* for aistore to read bucket metadata, I believe).
Two things that were worrisome are the missing ATIME when the file is not cached and the different checksum depending on whether the file is cached or not (it's the same file with a different name). But what I did not realize was that the bucket is not accessible through the s3 endpoint: when listing with s3cmd the bucket is not visible, and if I try to list it, it will fail with a 404 on ais://aistore-test.
So I switched to creating an ais bucket with an aws backend:
The first issue I had was with the default max page size: the target was outputting errors such as: page size exceeds the maximum value (got: 10000, max expected: 1000). I believe it's due to the default for ais being higher than the one for aws. There is probably a clean way to do it, but I just changed it here https://github.com/NVIDIA/aistore/blob/master/cmn/api_const.go#L280 :).
After this I encountered the same missing ATIME when the file is not cached, which prevented s3cmd from listing the bucket. After loading each file in cache, the ls returned, but all files were listed twice.
When trying to push a file with s3cmd, it failed after multiple retries due to an invalid checksum (the file is still uploaded). Setting the checksum.type props appears to have no effect on the file checksum (I tried setting it before and after setting backend_bck; I did not check with tcpdump if the ETag was present).
When fetching a file with s3cmd, if it's not cached then it will fail first due to the missing ATIME; then, after it's loaded in cache, it will download the file but output an error due to an invalid checksum.
Edit: Additionally I did two upload tests on a standard ais bucket. The MLFlow python client, which relies on botocore, unsurprisingly failed with the same error as awscli (Could not connect to the endpoint URL: "http://s3.ais.amazonaws.com").
Thank you and have a nice day.