xmc-aalto / cascadexml

Code for our paper CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification

Unable to run on Wiki10-31K files #2

Open caseware66 opened 1 year ago

caseware66 commented 1 year ago

max_cluster = max([len(c) for c in clusters[-1]])
self.num_ele = [len(g) for g in clusters] + [params.num_labels]

for i in range(self.num_ele[-2]):
    clusters[-1][i] = np.pad(clusters[-1][i], (0, max_cluster-len(clusters[-1][i])), constant_values=self.num_ele[-1]).astype(np.int32)

Hello, I would like to ask some questions about CascadeXML.py, lines 60-67:

1. max_cluster = max([len(c) for c in clusters[-1]])
   Should this line be changed to max_cluster = max([len(c) for c in clusters[0]])?
2. What is the purpose of the three lines 63-65?
3. The line clusters = [np.stack(c) for c in clusters] raises an error when the model is constructed:

model = CascadeXML(params = params, train_ds = train_dataset, device = device).to(device)

File "/home/rmzxnlp/project_2023/cascadexml-main/src/CascadeXML.py", line 71, in __init__
    clusters = [np.stack(c) for c in clusters]
File "/home/rmzxnlp/project_2023/cascadexml-main/src/CascadeXML.py", line 71, in <listcomp>
    clusters = [np.stack(c) for c in clusters]
File "<__array_function__ internals>", line 6, in stack
File "/root/miniconda3/envs/cascadexml1/lib/python3.6/site-packages/numpy/core/shape_base.py", line 427, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape
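
For reference, here is a minimal sketch of what the error means and one possible workaround. np.stack requires every array in a level to have the same length, and the quoted code pads only clusters[-1], so a ragged earlier level would plausibly trigger this ValueError. The pad_clusters helper and the toy cluster data below are hypothetical and not taken from the repository. The pad value for each level is assumed to be the number of elements in the next level, an out-of-range index, mirroring how the quoted code pads the last level with self.num_ele[-1].

```python
import numpy as np

def pad_clusters(level, pad_value):
    """Pad every cluster in one level to the longest cluster, then stack.

    Hypothetical helper: `level` is a list of 1-D index lists of varying
    length, `pad_value` is an index that no real child uses.
    """
    max_len = max(len(c) for c in level)
    return np.stack([
        np.pad(np.asarray(c, dtype=np.int32),
               (0, max_len - len(c)),
               mode="constant",
               constant_values=pad_value)
        for c in level
    ])

# Toy data (assumption): two levels of ragged clusters over 6 labels.
clusters = [
    [[0, 1], [2]],               # level 0 groups the 3 clusters of level 1
    [[0, 1, 2], [3], [4, 5]],    # level 1 groups the 6 labels
]
num_labels = 6

# Same bookkeeping as the quoted code: sizes per level plus the label count.
num_ele = [len(g) for g in clusters] + [num_labels]

# Pad each level with the size of the level below it, then stack.
padded = [
    pad_clusters(level, pad_value=num_ele[i + 1])
    for i, level in enumerate(clusters)
]

for p in padded:
    print(p.shape)
    print(p)
```

With every level padded this way, each entry of padded is a rectangular int32 array, so stacking no longer depends on the clusters having equal sizes. Whether padding earlier levels fits the rest of the model code is something the maintainers would need to confirm.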