s3fs-fuse / s3fs-fuse

FUSE-based file system backed by Amazon S3
GNU General Public License v2.0

all my file contents change to ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ #428

Closed Fei-Guang closed 7 years ago

Fei-Guang commented 8 years ago

I used 1.79 and started s3fs as

$ s3fs mybucket /opt/s3fs -o nomultipart,use_cache=/tmp

Last week, everything seemed OK.

The cache under /tmp/ had grown to about 30G, so today I removed all of the cache under /tmp and restarted my Ubuntu system. After I remounted /opt/s3fs, the contents of every file had changed.

$vi /opt/s3fs/config.txt

^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ~ ~ ~ ~ ~ ~ ~ ~ ~ "config.txt" [noeol] 1L, 509C

jamessoubry commented 8 years ago

We are getting the same issue with 1.79 and 1.80, but not with 1.78. It seems to corrupt only plain text files when copying to the cache directory.

Fei-Guang commented 8 years ago

Do you know why the text file changed to a binary file?

jamessoubry commented 8 years ago

Not sure yet. I'm reviewing the diff between 1.78 and 1.79 to try to work out what broke it, but there's a lot of code to go through.

ggtakec commented 8 years ago

Hi, @Fei-Guang. The cache file made by s3fs holds the contents of the object (file), obtained in blocks. A range that has not yet been fetched with GET does not exist in the cache file, so every area that has not been downloaded looks like 0x00.

Maybe you are looking at those cache files directly, and I think you are seeing a range that has not yet been fetched.

Regards,
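
A minimal sketch of the behaviour described above, assuming a cache file that is sized to the object's length first and then filled block by block. It does not use s3fs itself: `BLOCK_SIZE`, `make_partial_cache`, `read_all` and the block contents are hypothetical names and values chosen for illustration. Ranges that were never written read back as 0x00 bytes, which vi displays as ^@.

```python
import os
import tempfile

BLOCK_SIZE = 16  # hypothetical block size for the demo, not an s3fs value


def make_partial_cache(path, total_blocks, downloaded):
    """Create a file sized for total_blocks and fill only some blocks.

    `downloaded` maps a block index to the bytes "fetched" for that block.
    """
    with open(path, "wb") as f:
        # Reserve the full object size up front; no data is written yet.
        f.truncate(total_blocks * BLOCK_SIZE)
        for index, data in downloaded.items():
            f.seek(index * BLOCK_SIZE)
            f.write(data)


def read_all(path):
    """Return the whole file as bytes."""
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "config.txt")
    # Only the middle block of three has been "downloaded".
    make_partial_cache(cache_path, 3, {1: b"key=value-------"})
    content = read_all(cache_path)

# The first and last blocks were never written, so they hold NUL bytes.
print(content)
```

Reading the same object through the mount point, rather than opening the cache file directly, should trigger the missing GETs, assuming the cache bookkeeping is intact.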

jamessoubry commented 8 years ago

Hi @ggtakec, @Fei-Guang. I have done extensive testing and have worked out that, for us, the issue happens when we delete items from the cache folder without deleting the .stat folder. We originally had a Puppet script that deleted items that had been held in the cache for a long time, to reduce the size of the cache folder. It seems this is no longer possible. @Fei-Guang, can you test whether this is the same issue for you?
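
A cleanup job that removes both halves of a cache entry together might look like the sketch below. It rests on an assumption that should be verified on your own system: that the stat file for `<cache_dir>/<bucket>/<path>` lives at `<cache_dir>/.<bucket>.stat/<path>`. `purge_old_cache` and `max_age_seconds` are hypothetical names, not part of s3fs, and it is probably safest to run something like this only while the bucket is unmounted.

```python
import os
import time


def purge_old_cache(cache_dir, bucket, max_age_seconds, now=None):
    """Delete cache files older than max_age_seconds with their stat files.

    Assumed layout (check it on your system before use):
      data: <cache_dir>/<bucket>/<path>
      stat: <cache_dir>/.<bucket>.stat/<path>
    Returns the list of relative paths that were removed.
    """
    now = time.time() if now is None else now
    data_root = os.path.join(cache_dir, bucket)
    stat_root = os.path.join(cache_dir, "." + bucket + ".stat")
    removed = []
    for dirpath, _dirnames, filenames in os.walk(data_root):
        for name in filenames:
            data_path = os.path.join(dirpath, name)
            if now - os.path.getmtime(data_path) < max_age_seconds:
                continue  # still fresh, keep both files
            rel = os.path.relpath(data_path, data_root)
            stat_path = os.path.join(stat_root, rel)
            # Remove the stat file first so that no stat entry is left
            # describing a cache file that no longer exists.
            if os.path.exists(stat_path):
                os.remove(stat_path)
            os.remove(data_path)
            removed.append(rel)
    return removed
```

For example, `purge_old_cache("/tmp", "mybucket", 7 * 24 * 3600)` would drop entries that have not been modified for a week, under the layout assumption above.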

ggtakec commented 8 years ago

Hi, @jamessoubry. As you said, I was able to reproduce the bug after I deleted the cache file without deleting the .stat file. Since the bug occurs when the cache file is deleted during the upload/download of files, I need some time to fix this. Please wait a little while.

And thanks for your assistance.

ggtakec commented 8 years ago

@jamessoubry I merged #444, which fixes the case where cache files are removed during the upload/download of an object. Please try it. Thanks in advance for your help.

jamessoubry commented 8 years ago

@ggtakec Thanks for the help. I will test and get back to you.

jamessoubry commented 8 years ago

@ggtakec FYI, I've not got round to testing this yet. I will hopefully do it soon.

ggtakec commented 7 years ago

@jamessoubry I merged code into the master branch to fix bug #435, which is very similar to this issue. I fixed this issue with #444, but there may still have been defects. Now that I have merged the code from #511, I think this issue is solved.

I'm going to close this issue, but please reopen it or post a new issue if the problem continues.

Thanks in advance for your help.

ayush-san commented 4 years ago

I am using s3fs version 1.85 and am facing this issue in only some files, where part of the content is changed to ^@^@^@^@^@^@^@^@^@^@

    def _chain_tasks(self,
                     sqoop_import_task: SqoopImportOperator,
                     process_and_replicate_task: ProcessAndReplicateOperator,
                     table_cleanup_task: TableCleanupOperator,
                     hive_msck_repair_task: Optional[HiveMsckRepairOperator]) -> None:
        sqoop_import_task >> process_and_replicate_task >> table_cleanup_task
        if hive_msck_repair_task:
            proces^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@

I am using an s3fs entry in fstab to mount my S3 folder:

s3fs#BUCKET_NAME:/airflow/ /mnt/airflow fuse _netdev,iam_role=auto,umask=0022,uid=0,gid=0,allow_other

It was working fine, but with the new deployment, the content of some files is getting changed to ^@^@^@^@^@^@^@^@^@^@.

Unmounting and mounting again and restarting the process fixed the issue, but I don't think that is the correct way to proceed.
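
To find which files are affected, one option is to scan the mounted tree for long runs of NUL bytes. The sketch below is a generic check, not an s3fs feature: `find_nul_runs`, `scan_tree` and the `min_run` threshold are hypothetical names and values. Binary files legitimately contain zero bytes, so it is only meaningful for files that are expected to be text.

```python
import os


def find_nul_runs(path, min_run=8, chunk_size=1 << 16):
    """Return (offset, length) for each run of at least min_run NUL bytes."""
    runs = []
    run_start = None  # file offset where the current NUL run began
    offset = 0        # file offset of the start of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            for i, byte in enumerate(chunk):
                if byte == 0:
                    if run_start is None:
                        run_start = offset + i
                elif run_start is not None:
                    length = offset + i - run_start
                    if length >= min_run:
                        runs.append((run_start, length))
                    run_start = None
            offset += len(chunk)
    # A run may extend to the end of the file.
    if run_start is not None and offset - run_start >= min_run:
        runs.append((run_start, offset - run_start))
    return runs


def scan_tree(root, min_run=8):
    """Map each file under root that contains NUL runs to those runs."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            runs = find_nul_runs(path, min_run)
            if runs:
                hits[path] = runs
    return hits
```

Calling `scan_tree("/mnt/airflow")` would then list the suspect files and the byte offsets where the NUL runs start.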

gaul commented 4 years ago

Please test with master and open a new issue if your symptoms persist.