Closed NickSale closed 4 years ago
Thanks @NickSale! I'm investigating whether this is a bug of giotto-tda
or of joblib
. Are you able to reproduce the issue with "nontrivial" input data? In your examples, the input persistence diagrams are all trivial (birth=death). When I make a non-trivial example (e.g. adding 1
in the second column) I do not see the issue.
@NickSale actually, let me correct that: when I change the line
X = np.linspace(0, 100, 300000)
for
X = np.linspace(0, 100)
or
X = np.random.random(100)
then things work out (regardless of triviality).
So indeed data size seems to be the issue here.
Hi @ulupo - I originally ran across this bug when working with a large collection of non-trivial diagrams coming from a project I'm working on. I managed to figure out what was happening and you might have seen that I've just submitted a pull request which should fix it, but it seems to be that joblib is working as expected, just that its automatic memory mapping functionality hadn't been accounted for in the giotto code.
Hi, yes I'm looking at #428 now, thank you very much for it! At first glance it looks like your diagnosis and fix are on point.
Describe the bug
"ValueError: assignment destination is read-only" error thrown when calling transform (or fit_transform) on PersistenceImage or HeatKernel with n_jobs > 1 and diagrams sufficiently large
To reproduce
Gist HeatKernel: https://gist.github.com/NickSale/30d9ae30b2d99b3d62700a4a92907422 Gist PersistenceImage: https://gist.github.com/NickSale/403ba2218de93fc9dbefb77e1ab9d0f3
Expected behavior
No error should be thrown.
Actual behaviour
HeatKernel Gist: https://gist.github.com/NickSale/9abdc2919e2c0cb941dec1af5228d5a6 PersistenceImage Gist: https://gist.github.com/NickSale/3d00e0e765b849ffb84592815158fe5a
Versions
Windows-10-10.0.18362-SP0 Python 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)] NumPy 1.18.5 SciPy 1.4.1 Joblib 0.13.2 Scikit-learn 0.22.2.post1 Giotto-tda 0.2.1
Additional context