generic_logsumexp with larger point clouds

TobiasMascetta commented 1 year ago

Hi jeanfeydy,

I have a question about a strange error that occurs when using the default geomloss.SamplesLoss() with larger point clouds (>= 25k points):

For smaller point clouds (~2500 points) everything works fine, but for larger ones I get the error "name 'generic_logsumexp' is not defined". If I swap the point and feature dimensions (so instead of 25k points with 3 features, I have 3 points with 25k features), everything works again and the library is much faster (I assume because vanilla EMD is O(num_points²)), but the output dimensions are randomly shifted, which is still manageable.

I also saw in issue #52 that someone had the same problem. I reinstalled pykeops and geomloss, but the problem remains.

Do you have any idea what the cause could be and how to fix it, so that I can use your implementation for large point clouds without the aforementioned work-arounds?

Best regards, Tobias

TobiasMascetta commented 1 year ago

I found a work-around. Posting it here in case someone else has the same problem:

[Work-around]

1. Uninstall geomloss and pykeops.
2. In /home/username/.cache, delete the keops cache folders (at your own risk).
3. Install pykeops==2.0 and geomloss.
4. In your main script, first import pykeops and run pykeops.test_numpy_bindings() and, if you're using torch, also pykeops.test_torch_bindings().
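
The last step can be sketched as a small helper that imports pykeops up front and reports any failure explicitly, instead of letting it show up later as a NameError. The helper name check_pykeops_bindings and its module_name parameter are hypothetical; only pykeops.test_numpy_bindings() and pykeops.test_torch_bindings() come from the steps above.

```python
import importlib


def check_pykeops_bindings(use_torch=False, module_name="pykeops"):
    """Import pykeops and run its self-tests; return (ok, message).

    Hypothetical helper: module_name is only a parameter so that the
    failure path can be exercised without touching a real installation.
    """
    try:
        keops = importlib.import_module(module_name)
    except Exception as exc:  # ImportError, EOFError from a broken cache, ...
        return False, f"{module_name} import failed: {type(exc).__name__}: {exc}"

    try:
        # Both self-tests compile a small formula and print their own status.
        keops.test_numpy_bindings()
        if use_torch:
            keops.test_torch_bindings()
    except Exception as exc:
        return False, f"{module_name} self-test failed: {type(exc).__name__}: {exc}"

    return True, f"{module_name} imported and self-tests ran"
```

Calling check_pykeops_bindings(use_torch=True) at the top of the main script, before importing geomloss, and printing the returned message should make a broken pykeops installation visible immediately.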

[Feature Request]

Explicit import errors or warnings for pykeops in the except branch of the geomloss imports.
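
A minimal sketch of what such a warning could look like. The helper optional_import is hypothetical and is not geomloss's actual code; it only illustrates reporting the original exception instead of discarding it.

```python
import importlib
import warnings


def optional_import(module_name):
    """Try to import an optional dependency; warn loudly if that fails.

    Returns the module on success and None on failure.
    """
    try:
        return importlib.import_module(module_name)
    except Exception as exc:
        # Keep the original exception type and message in the warning so
        # the user can see why the optional backend is unavailable.
        warnings.warn(
            f"Optional dependency {module_name!r} could not be imported "
            f"({type(exc).__name__}: {exc}); features that rely on it "
            "will fail when called.",
            RuntimeWarning,
            stacklevel=2,
        )
        return None
```

RuntimeWarning is used here because Python's default filters hide ImportWarning, which would defeat the purpose.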

[Background] The source of the error is somewhere in the pykeops library. Essentially, if your number of points exceeds a certain threshold (5000 in my case), some pykeops modules get called by geomloss. Unfortunately, these modules are imported in geomloss via try-except, so import errors are silenced, which is why the error only occurs at call time.
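
This failure mode can be reproduced without pykeops. In the sketch below, the module name keops_module_that_fails_to_import and the function pick_backend are made up for illustration, and the threshold of 5000 is the one observed above; geomloss's real import code may look different.

```python
try:
    # Made-up module name, chosen so that the import fails in this demo.
    from keops_module_that_fails_to_import import generic_logsumexp
except Exception:
    # The failure is swallowed here: no message, no re-raise.
    pass


def pick_backend(n_points, threshold=5000):
    """Return the code path a loss call would take for n_points samples."""
    if n_points < threshold:
        # Small clouds never touch the name that failed to import.
        return "tensorized"
    # Large clouds reach for a name that was never bound, so Python
    # raises NameError only now, far away from the failed import.
    return generic_logsumexp
```

pick_backend(2500) returns normally, while pick_backend(25000) raises NameError: name 'generic_logsumexp' is not defined, which matches the behaviour reported at the top of this issue.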

pykeops itself has some problems with its cache in /home/username/.cache. It seems to me (speculation!) that if you interrupt the execution of your code, some pickle file in the cache may not be handled correctly and ends up empty. From then on, importing pykeops fails with the error message 'Ran out of input' (which geomloss silences, as mentioned).
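
To check this hypothesis before deleting anything, one could list the zero-byte files in the keops cache folders. This is a sketch under two assumptions based on the paths mentioned above: the cache folders sit directly under ~/.cache, and their names start with "keops". The function name find_empty_cache_files is hypothetical. It only lists files and deletes nothing.

```python
from pathlib import Path


def find_empty_cache_files(cache_root=None, pattern="keops*"):
    """List zero-byte files inside cache folders matching `pattern`.

    cache_root defaults to ~/.cache; both the default and the folder
    pattern are assumptions, so adjust them to your system.
    """
    root = Path(cache_root) if cache_root is not None else Path.home() / ".cache"
    empty_files = []
    for folder in sorted(root.glob(pattern)):
        if not folder.is_dir():
            continue
        for path in sorted(folder.rglob("*")):
            # An empty file cannot be unpickled and is a likely culprit.
            if path.is_file() and path.stat().st_size == 0:
                empty_files.append(path)
    return empty_files
```

If this returns any paths, removing the corresponding cache folder (step 2 of the work-around) is probably the relevant fix.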

ZhaiJiaKai commented 3 months ago

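So I found a work-around, perhaps someone else has the same problem, so:

[Work-around]

Uninstall geomloss and pykeops

In /home/username/.cache, delete the keops cache folders (at your own risk)

Install pykeops==2.0 and geomloss

In your main, first import pykeops and run pykeops.test_numpy_bindings() and, if you are using torch, also pykeops.test_torch_bindings()

[Feature Request]

Explicit import errors or warnings for pykeops in the except part of geomloss.

[Background] The source of the error is somewhere in the pykeops library. Essentially, if your number of points exceeds a certain threshold (in my case 5000), some pykeops modules get called in geomloss. Unfortunately, these modules are imported in geomloss via try-except, so import errors get silenced, and that is why the error only occurs during calling.

pykeops itself has some problems regarding its cache in /home/username/.cache. It seems to me (speculation!) that if you stop the execution of your code, it can happen that some pickle file in the cache is not managed correctly, so it becomes empty, and from then on the import of pykeops will fail with the error message 'run out of input' (which gets silenced in geomloss, as mentioned).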

Hello, after following your steps, I still get the above error. How can I solve it?

jeanfeydy / geomloss

generic_logsumexp with larger point clouds #68