```python
# get the transpose since the most of the functions in implicit expect (user, item) sparse matrices instead of (item, user)
user_plays = artist_user_plays.T.tocsr()
```
Yeah, that's a great callout. The datasets were generated before the API refactor in #481, and we really should generate new ones with transposed data.
Hi,
In the lastfm tutorial, there is a specific step that transposes the loaded matrix with `artist_user_plays.T.tocsr()`.
However, it looks like this may no longer be necessary in some cases. In https://github.com/benfred/implicit/commit/32c06aa669f7597d69c1a9a1c56cf1a1d0c5f1ce#diff-b8a4c78fbfcc629a3d35255010d1a4ae21d5909664b8d3c1283da18359ae5a0aL77-R77 some changes were made which also swapped the order of users/artists when building the sparse matrix. Therefore, if we generate a new copy of the hdf5 from the source data file, the `artist_user_plays` matrix is already in the correct orientation.

However, the binary hdf5 file that the tool downloads appears to have been generated with the older version of this code, which still uses the (artist, user) format.
To reduce confusion, it would be a good idea to either re-generate the binary hdf5 and remove the transform from the tutorial, or revert the dimension change in the matrix generation step.
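Until the hosted file and the generation code agree, one possible workaround is to check the shape before transposing, so the same tutorial code works with either orientation. This is only a rough sketch: `to_user_item` is a hypothetical helper, not part of implicit, and it assumes the user and artist counts are known (in the tutorial these would presumably be the lengths of the `users` and `artists` arrays returned alongside the matrix) and that they differ.

```python
import numpy as np
from scipy.sparse import csr_matrix


def to_user_item(plays, n_users, n_items):
    """Hypothetical helper: return `plays` as a (user, item) CSR matrix.

    Transposes only when the matrix arrives in (item, user) orientation.
    """
    if n_users == n_items:
        # With a square matrix the shape tells us nothing about orientation.
        raise ValueError("cannot infer orientation from shape when n_users == n_items")
    if plays.shape == (n_users, n_items):
        return plays.tocsr()
    if plays.shape == (n_items, n_users):
        return plays.T.tocsr()
    raise ValueError(f"unexpected shape {plays.shape}")


# Toy data standing in for the downloaded file: 2 artists x 3 users.
artist_user_plays = csr_matrix(np.array([[5, 0, 1],
                                         [0, 2, 0]]))
user_plays = to_user_item(artist_user_plays, n_users=3, n_items=2)

# A matrix that is already (user, item) passes through without a transpose.
same = to_user_item(user_plays, n_users=3, n_items=2)
```

A shape check like this only hides the inconsistency; re-generating the hdf5 would still be the cleaner fix.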