Closed zoubohao closed 2 months ago
Dear zoubohao, many thanks for pointing out the error! We tried to recreate it, but we cannot make this error occur. It shouldn't be influenced by the number of points. Could you share your data set with us? Maybe you can make it occur with a small subset? We are happy to fix the error and we know the code line where it happens, but we have no idea how to make it occur. Many thanks Melanie
How big is the data set? We tested it with up to 1,000,000 points today but could not reproduce the issue. Is there anything else special about the data that we could try to reproduce?
Dear author:
The following is my code:
import pickle

from flspp import FLSpp


def readPickle(readPath: str) -> object:
    with open(readPath, "rb") as rh:
        obj = pickle.load(rh)
    return obj


def foo(X):
    print("start cluster", len(X))
    cluster_model = FLSpp(8, local_search_iterations=15)
    cluster_model = cluster_model.fit(X)
    labels_ = cluster_model.labels_
    center_points = cluster_model.cluster_centers_


if __name__ == "__main__":
    file_path = "../flspp_x_8.pkl"
    X = readPickle(file_path)
    foo(X)
flspp_x_8.pkl is a pickle file containing a list of vectors. I have attached it as a zip archive.
Thank you for replying and solving this. flspp_x_8.zip
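For anyone preparing a similar report, it can help to state what the pickled object actually holds. The following is only a minimal sketch using NumPy; `describe_points` is a hypothetical helper name, not part of flspp, and the sample data is made up to resemble a list of float32 vectors.

```python
import numpy as np


def describe_points(X):
    """Return (container type name, element dtype, shape) of a point collection."""
    arr = np.asarray(X)  # does not modify X; only used for inspection
    return type(X).__name__, arr.dtype, arr.shape


# Made-up sample: a plain list of float32 vectors
sample = [np.zeros(3, dtype=np.float32) for _ in range(4)]
kind, dtype, shape = describe_points(sample)
print(kind, dtype, shape)  # prints: list float32 (4, 3)
```

Including these three values in a bug report makes a dtype or shape mismatch visible without sharing the full data set.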
We did some debugging over here: I could not reproduce your error message "If this gets printed, the generated number was too big!", however, I did get a segmentation fault when I ran your code.
The issue is that your pickle file holds a float32 array, but we expect float64. Could you try again like this:
import numpy as np  # needed for np.float64
...
if __name__ == "__main__":
    file_path = "../flspp_x_8.pkl"
    X = readPickle(file_path)
    X = X.astype(np.float64, copy=False)  # new: convert to float64
    foo(X)
and report back if that works? :)
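For readers who hit the same float32 vs. float64 mismatch, a slightly more defensive conversion could look like the sketch below. It assumes the input may be either a list of vectors or an array of any numeric dtype. `to_float64_matrix` is a hypothetical helper, not part of flspp, and the shape and finiteness checks are extra assumptions about what a clustering routine would want, not documented requirements of the library.

```python
import numpy as np


def to_float64_matrix(X):
    """Convert a list or array of points to a C-contiguous 2-D float64 array."""
    # Copies only if the dtype or memory layout actually has to change.
    arr = np.ascontiguousarray(X, dtype=np.float64)
    if arr.ndim != 2:
        raise ValueError(f"expected a 2-D array of points, got shape {arr.shape}")
    if not np.isfinite(arr).all():
        raise ValueError("input contains NaN or infinite values")
    return arr


# Made-up sample: two float32 points stored in a plain list
points32 = [
    np.array([0.5, 1.5], dtype=np.float32),
    np.array([2.0, 3.0], dtype=np.float32),
]
points64 = to_float64_matrix(points32)
print(points64.dtype, points64.shape)  # prints: float64 (2, 2)
```

Unlike calling `astype` directly, this also works when the pickle holds a Python list rather than a NumPy array.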
I also pushed a fix in commit b9a6f4e that should fix the float32 vs. float64 issue. Does your original code run now?
Thank you very much. It worked!!!
Hi authors
I got this message while running the code, and the run was terminated. My data set contains many points. I do not know whether something went wrong during the run.
Best