sisdanghoang closed this issue 4 months ago
@sisdanghoang Hi, could you test if the same bug happens in the official faiss conda package? This repository is merely a thin packaging workflow and cannot handle a bugfix to the underlying faiss functionality.
@kyamagu Hi, thank you for your suggestion. Unfortunately, I'm running a Windows system, and the official conda package for FAISS is not compatible with Windows. As a result, I can only install the faiss-cpu version. I understand that this repository is only a packaging workflow and cannot directly address issues in the underlying FAISS functionality.
I’m currently facing an issue with FAISS’s save_local and load_local functions when using Japanese folder names. The operations fail to execute properly, which seems to be an encoding-related problem. If you or anyone else in the community has encountered this issue and found a solution, I would greatly appreciate any advice or suggestions.
The error is most likely raised in the file I/O code of the upstream faiss library. https://github.com/facebookresearch/faiss/blob/6e7d9e040f9be9734277c3f27b2cb364a67f442d/faiss/impl/io.cpp#L66
I'm not familiar with how Windows handles Unicode filenames, but you'd need to fix the upstream implementation. https://learn.microsoft.com/cpp/c-runtime-library/reference/fopen-wfopen
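Until the upstream code is changed, one way to sidestep the problem is to make sure the path that faiss sees contains only ASCII characters, and to let Python move the files to and from the Japanese-named folder. The sketch below is an illustration, not something proposed in this thread: `save_via_ascii_tempdir` and `load_via_ascii_tempdir` are hypothetical helpers, and `save_fn` / `load_fn` stand in for whatever call writes or reads the index (for example a bound `save_local` or `load_local`). It also assumes the system temporary directory has an ASCII-only path, which may not hold if the Windows user name contains non-ASCII characters.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Any, Callable


def save_via_ascii_tempdir(save_fn: Callable[[str], None], target_dir: str) -> None:
    """Let save_fn write into a temporary folder, then copy the result to target_dir.

    save_fn is a hypothetical callback that takes a folder path and writes the
    index files into it. Only Python's own file APIs touch target_dir.
    """
    with tempfile.TemporaryDirectory(prefix="faiss_tmp_") as tmp:
        save_fn(tmp)
        # Python's file APIs accept Unicode paths, so the copy step can
        # target a folder with a Japanese name.
        shutil.copytree(tmp, target_dir, dirs_exist_ok=True)


def load_via_ascii_tempdir(load_fn: Callable[[str], Any], source_dir: str) -> Any:
    """Copy source_dir to a temporary folder and let load_fn read from there.

    load_fn is a hypothetical callback that takes a folder path and returns
    the loaded object.
    """
    with tempfile.TemporaryDirectory(prefix="faiss_tmp_") as tmp:
        shutil.copytree(source_dir, tmp, dirs_exist_ok=True)
        return load_fn(tmp)
```

The cost is one extra copy of the index files per save or load, so this is a stopgap for small to medium indexes rather than a replacement for an upstream fix.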
@kyamagu, thank you so much for your answer. I have found a solution.
I added the following code, and it works correctly now.
import locale
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
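The locale name `'en_US.UTF-8'` is not guaranteed to exist on every machine, and `locale.setlocale` raises `locale.Error` when a name is not supported. A slightly more defensive variant of the snippet above might look like the sketch below. The helper name `set_utf8_locale` and the list of fallback names are assumptions for illustration: `.UTF8` is the form shown in Microsoft's C runtime documentation, and `C.UTF-8` is common on Linux. The thread does not explain why the fix works; presumably the C runtime interprets narrow path strings as UTF-8 once a UTF-8 locale is active.

```python
import locale


def set_utf8_locale(candidates=("en_US.UTF-8", ".UTF8", "C.UTF-8")):
    """Try each locale name in order; return the first one accepted, else None."""
    for name in candidates:
        try:
            locale.setlocale(locale.LC_ALL, name)
            return name
        except locale.Error:
            # This name is not supported on this system; try the next one.
            continue
    return None


chosen = set_utf8_locale()
if chosen is None:
    print("No UTF-8 locale available; the locale was not changed.")
else:
    print("Using locale:", chosen)
```

Call this once at startup, before any save or load, because `setlocale` changes the locale for the whole process.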
Describe the bug
When attempting to save or load indexes using FAISS's save_local and load_local functions with Japanese folder names, the operations fail. This issue occurs despite the filesystem and Python script being properly configured to handle UTF-8 encoding.

To Reproduce
Steps to reproduce the behavior, using save_local and load_local with a Japanese folder name:
1. Save an index with save_local.
2. Load the index with load_local.

Expected behavior
FAISS should be able to save to and load from folders with Japanese names without any issues, given that the system and script are correctly set up to support UTF-8.

Desktop:

Additional context
This issue may be related to how FAISS handles non-ASCII characters in file paths. It is crucial for users working with non-English file systems to have full functionality with FAISS operations.