Open autocyz opened 4 years ago
@autocyz How quickly did you read the file after you changed it? S3 does not guarantee read-after-update consistency (see https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel). If you let less than 2-3 minutes go by between your update and running get_pid() you might have gotten a stale copy.
@Liam3851 With code1 the S3 change is never picked up, but with code2 it is. I don't think this is about S3 update consistency; I think it's about the s3fs in-memory cache.
Do you think it's an s3fs issue then? It does cache file listings, and the instances of S3FileSystem are cached. You could try `S3FS.clear_instance_cache`.
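To make the caching explanation above concrete, here is a toy model in plain Python. It is not s3fs code: `ToyCachedFS`, its `get` factory, and the `remote` dict are hypothetical names invented for illustration, and the sketch only assumes the two behaviours mentioned in the comment (filesystem instances are reused, and each instance remembers file info it has already looked up).

```python
class ToyCachedFS:
    """Hypothetical stand-in for a filesystem with two cache layers."""

    _instances = {}  # instance cache: one shared object per name

    def __init__(self, store):
        self.store = store      # dict playing the role of the remote bucket
        self._info_cache = {}   # per-instance cache of file info

    @classmethod
    def get(cls, name, store):
        # Reuse an existing instance if one was already created for this name.
        if name not in cls._instances:
            cls._instances[name] = cls(store)
        return cls._instances[name]

    @classmethod
    def clear_instance_cache(cls):
        # Drop all cached instances, and with them their info caches.
        cls._instances.clear()

    def info(self, key):
        # Only hit the "remote" store the first time a key is requested.
        if key not in self._info_cache:
            self._info_cache[key] = dict(self.store[key])
        return self._info_cache[key]


def demo():
    ToyCachedFS.clear_instance_cache()  # start from a clean state
    remote = {"data.csv": {"size": 10, "etag": "v1"}}

    first = ToyCachedFS.get("bucket", remote).info("data.csv")["etag"]

    # The remote file changes after the first read.
    remote["data.csv"] = {"size": 12, "etag": "v2"}

    # Same cached instance, same cached info: the change is not seen.
    stale = ToyCachedFS.get("bucket", remote).info("data.csv")["etag"]

    # After clearing the instance cache, a new instance re-reads the store.
    ToyCachedFS.clear_instance_cache()
    fresh = ToyCachedFS.get("bucket", remote).info("data.csv")["etag"]

    return first, stale, fresh
```

If the behaviour in this issue comes from the same kind of caching, clearing the cached instances before re-reading would be expected to surface the updated file, which is a quick way to test the hypothesis.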
`pd.read_csv`: the default `memory_map` does not seem to work; explicitly setting `memory_map` to False works
Code Sample
Problem description
When I use `code1`, `info` does not change if the S3 file changes, but `code2` does pick up the change. The only difference between the two snippets is the `memory_map` parameter. The default value of `memory_map` is already False, so I am confused.
Output of `pd.show_versions()`