enoche / MMRec

A Toolbox for MultiModal Recommendation. Integrating 10+ Models...
GNU General Public License v3.0
367 stars 46 forks

Problem about dataset process #20

Closed wenyuzzz closed 1 year ago

wenyuzzz commented 1 year ago

Thank you very much for your excellent work.

While processing the dataset, I ran into a question about block 13 of 0rating2inter.ipynb: should we sort the data by timestamp before this line?

split_timestamps = list(np.quantile(df[ts_id], split_ratios))
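
For context, here is a minimal sketch of how quantile thresholds like these could be turned into split labels. The timestamp values, `split_ratios = [0.8, 0.9]`, and the 0/1/2 label encoding are assumptions made for illustration; they are not taken from the notebook.

```python
import numpy as np

# Hypothetical, deliberately unsorted interaction timestamps (not real data).
timestamps = np.array([50, 10, 40, 20, 30, 90, 70, 60, 100, 80])
split_ratios = [0.8, 0.9]  # assumed train/valid boundaries

# Same call as in the question: thresholds are computed from the values alone.
split_timestamps = list(np.quantile(timestamps, split_ratios))

# Assign each row to a split by comparing it with the thresholds:
# 0 = train (earliest), 1 = valid, 2 = test (latest).
labels = np.digitize(timestamps, split_timestamps)

print(split_timestamps)
print(labels)
```

Each row is labeled by comparing its own timestamp with the thresholds, so the order of the rows in the input does not affect which split a row lands in.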

wenyuzzz commented 1 year ago

I tested this with an example: np.quantile takes care of the ordering itself, so the data does not need to be sorted beforehand.


import numpy as np

# Sequence
data = np.array([60, 70, 87, 56, 35, 64, 28, 84, 89, 65])

# Calculate quantiles
q25 = np.quantile(data, 0.25)
q50 = np.quantile(data, 0.5)
q75 = np.quantile(data, 0.75)

# Split data based on quantiles
data_q1 = data[data <= q25]
data_q2 = data[(data > q25) & (data <= q50)]
data_q3 = data[(data > q50) & (data <= q75)]
data_q4 = data[data > q75]

print("First quartile data:", data_q1)
print("Second quartile data:", data_q2)
print("Third quartile data:", data_q3)
print("Fourth quartile data:", data_q4)
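
A more direct way to check the same point is to compare the quantiles of a shuffled copy and a sorted copy of the data. This is a small sketch reusing the array above; the seed and variable names are arbitrary choices for illustration.

```python
import numpy as np

data = np.array([60, 70, 87, 56, 35, 64, 28, 84, 89, 65])

# Build two orderings of the same values.
rng = np.random.default_rng(0)
shuffled = rng.permutation(data)
ordered = np.sort(data)

probs = [0.25, 0.5, 0.75]
q_shuffled = np.quantile(shuffled, probs)
q_ordered = np.quantile(ordered, probs)

# The quantiles depend only on the values, not on their order.
same = np.allclose(q_shuffled, q_ordered)
print(q_shuffled, q_ordered, same)
```

Note that np.quantile does not reorder the input array itself; it only returns thresholds, which is all that a threshold-based split needs.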