marco-peer / icdar23

Towards Writer Retrieval for Historical Datasets@ICDAR2023 - an unsupervised approach using NetRVLAD and Similarity Graph Reranking

Reproducing code problems #2

Open hwlsrr opened 8 months ago

hwlsrr commented 8 months ago

Hi, I tried the test code and ran into two problems:

1. After processing the images with extract_patches_only.py, only 14 images are processed, and when the prediction runs, the computer gets stuck (I think the memory explodes); I can only reduce the number of images to 5. How do you handle this step so that the whole test set runs through?

2. Using 5 images as a test set, Best-mAP is poor while Best-Top1 looks normal. I am not sure where the problem occurs: Best-Top1 0.7144, Best-mAP 0.2093.

marco-peer commented 8 months ago
 after processing the images with extract_patches_only.py, only 14 images are processed

If you run the scripts to extract the patches, all images in the given input directory are processed, so I would recheck your directories.

  I can only reduce the number of images to 5

Could you please give more details about the script you are running? For inference, your machine needs to be able to store the patches of one image at once (for ICDAR19, these are usually a few thousand patches, which should be feasible on a typical machine); otherwise, you need to rewrite the script.
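If rewriting becomes necessary, the idea is to keep only one page's patches in memory at a time and store just the aggregated page descriptor. A minimal sketch of the idea (the helper names here are only illustrative, not the exact functions of this repository):

import numpy as np

def embed_pages(pages, extract_patch_features):
    # pages: iterable of patch batches, one entry per page
    # extract_patch_features: model forward pass returning an
    # array of shape (n_patches, dim) for one page
    page_descriptors = []
    for patches in pages:
        feats = extract_patch_features(patches)      # (n_patches, dim)
        page_descriptors.append(feats.mean(axis=0))  # aggregate to (dim,)
    return np.stack(page_descriptors)                # (n_pages, dim)

This way, peak memory is bounded by the largest single page rather than the whole test set.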

  when the prediction runs, the computer gets stuck (I think the memory explodes)

I am not quite sure about your hardware (a minimum of computational resources is required, of course); please see above.

  Using 5 images as a test set, Best-mAP is poor while Best-Top1 looks normal.

The test set of the ICDAR19 dataset consists of 20k images, so processing only 5 does not make sense and is not comparable to the full set, in particular for retrieval. I did not conduct any experiments on subsets, so I cannot help with that.

hwlsrr commented 8 months ago

If you run the scripts to extract the patches, all images in the given input directory are processed, so I would recheck your directories.

I selected 5 images from the original dataset, ran the script to extract their patches, and then ran prediction on the extracted patches:

python extract_patches_only.py --in_dir "D:\Haowl\icdar23-main\data\mpeer\resources\wi_comp_19_test_small" --out_dir "D:\Haowl\icdar23-main\data\mpeer\resources\wi_comp_19_test_patches_num_of_clusters64_small"

Could you please give more details about the script you are running?

When I run the prediction, the code either gets stuck at this step (the computer freezes):

Fitting PCA done
Calculate mAP..
2024-01-27 13:49:54,971 INFO building up NN-Classifier
2024-01-27 13:49:55,024 INFO KNN fitting data ((134349, 512) features)
2024-01-27 13:49:55,081 INFO using euclidean distance

Or it reports an error for insufficient memory allocation:

Fitting PCA done
Calculate mAP..
2024-01-27 13:49:54,971 INFO building up NN-Classifier
2024-01-27 13:49:55,024 INFO KNN fitting data ((134349, 512) features)
2024-01-27 13:49:55,081 INFO using euclidean distance
Traceback (most recent call last):
  File "D:\Haowl\icdar23-main\main.py", line 473, in <module>
    main(config)
  File "D:\Haowl\icdar23-main\main.py", line 438, in main
    wtest(model, logger, args, name='Test')
  File "D:\Haowl\icdar23-main\main.py", line 181, in wtest
    res, _ = _eval.eval(pfs_tf, writer)
  File "D:\Haowl\icdar23-main\evaluators\retrieval.py", line 20, in eval
    distances = self.calc_distances(features, labels, use_precomputed_distances=use_precomputed_distances)
  File "D:\Haowl\icdar23-main\evaluators\retrieval.py", line 72, in calc_distances
    distances = self.compute_distances(features)
  File "D:\Haowl\icdar23-main\evaluators\retrieval.py", line 222, in compute_distances
    self._distances = pairwise_distances(features, metric='euclidean', n_jobs=15)
  File "D:\Anaconda\envs\d2l\lib\site-packages\sklearn\metrics\pairwise.py", line 2022, in pairwise_distances
    return _parallel_pairwise(X, Y, func, n_jobs, **kwds)
  File "D:\Anaconda\envs\d2l\lib\site-packages\sklearn\metrics\pairwise.py", line 1568, in _parallel_pairwise
    Parallel(backend="threading", n_jobs=n_jobs)(
  File "D:\Anaconda\envs\d2l\lib\site-packages\joblib\parallel.py", line 1061, in __call__
    self.retrieve()
  File "D:\Anaconda\envs\d2l\lib\site-packages\joblib\parallel.py", line 938, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "D:\Anaconda\envs\d2l\lib\multiprocessing\pool.py", line 771, in get
    raise self._value
  File "D:\Anaconda\envs\d2l\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "D:\Anaconda\envs\d2l\lib\site-packages\joblib\_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "D:\Anaconda\envs\d2l\lib\site-packages\joblib\parallel.py", line 263, in __call__
    return [func(*args, **kwargs)
  File "D:\Anaconda\envs\d2l\lib\site-packages\joblib\parallel.py", line 263, in <listcomp>
    return [func(*args, **kwargs)
  File "D:\Anaconda\envs\d2l\lib\site-packages\sklearn\utils\fixes.py", line 117, in __call__
    return self.function(*args, **kwargs)
  File "D:\Anaconda\envs\d2l\lib\site-packages\sklearn\metrics\pairwise.py", line 1551, in _dist_wrapper
    dist_matrix[:, slice_] = dist_func(*args, **kwargs)
  File "D:\Anaconda\envs\d2l\lib\site-packages\sklearn\metrics\pairwise.py", line 328, in euclidean_distances
    return _euclidean_distances(X, Y, X_norm_squared, Y_norm_squared, squared)
  File "D:\Anaconda\envs\d2l\lib\site-packages\sklearn\metrics\pairwise.py", line 366, in _euclidean_distances
    distances = _euclidean_distances_upcast(X, XX, Y, YY)
  File "D:\Anaconda\envs\d2l\lib\site-packages\sklearn\metrics\pairwise.py", line 568, in _euclidean_distances_upcast
    d = -2 * safe_sparse_dot(X_chunk, Y_chunk.T, dense_output=True)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 738. MiB for an array with shape (10798, 8957) and data type float64

I am not quite sure about your hardware (a minimum of computational resources is required, of course)

My computer configuration is an RTX 3090 + 32 GB RAM, and after I processed 16 images using extract_patches_only.py, the patches are only 100 MB in total, so I'm wondering why I'm getting the memory explosion error.

Thank you very much for your guidance.

marco-peer commented 8 months ago
2024-01-27 13:49:54,971 INFO building up NN-Classifier
2024-01-27 13:49:55,024 INFO KNN fitting data ((134349, 512) features)
2024-01-27 13:49:55,081 INFO using euclidean distance

The issue seems to be that you do not perform aggregation on the patches, e.g. just run np.mean on all patches of the same page (or use a snippet from our repository).

If you have five pages, the KNN should deal with an array of shape (5, dim). The calculation of the full patch-level distance matrix, which in your case has shape (134349, 134349), does not fit in your memory.
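To give a rough sense of scale (a back-of-the-envelope estimate, nothing from the repository itself):

# Memory needed for a dense float64 distance matrix over all
# patch descriptors, using the count from your log (134349 patches)
n_patches = 134_349
bytes_needed = n_patches ** 2 * 8            # 8 bytes per float64 entry
print(f"{bytes_needed / 1024**3:.0f} GiB")   # -> ~134 GiB

After mean-pooling to one descriptor per page, the matrix is only (n_pages, n_pages), which is negligible.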

Hope that helps.

hwlsrr commented 8 months ago

The issue seems to be that you do not perform aggregation on the patches, e.g. just run np.mean

I see what you mean.

or use a snippet from our repository

What part of the code are you referring to as "a snippet from our repository"? I am using your code directly, without modification. Following your suggestion, I added a function aggregate_patches for page-level aggregation (see below). However, when executing the calc_map_from_distances function in your retrieval.py, an error occurs: when a writer has only one page left after aggregation, both top1_correct_count and top1_wrong_count are zero, so the line

top1_precsision = top1_correct_count / float(top1_correct_count + top1_wrong_count)

raises a division-by-zero error.

import numpy as np

def aggregate_patches(self, features, labels):
    # Mean-pool all patch descriptors that share a label (= page)
    # into a single global descriptor per label.
    unique_labels = np.unique(labels)
    aggregated_features = []
    aggregated_labels = []

    for label in unique_labels:
        # All patch descriptors belonging to this label
        label_indices = np.where(labels == label)[0]
        label_features = features[label_indices]
        # One descriptor per label via mean-pooling
        mean_feature = np.mean(label_features, axis=0)
        aggregated_features.append(mean_feature)
        aggregated_labels.append(label)

    aggregated_features = np.array(aggregated_features)
    aggregated_labels = np.array(aggregated_labels)
    return aggregated_features, aggregated_labels
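
For now, I avoid the crash with a small guard (just a sketch on my side, not necessarily the intended fix; the names mirror the quoted line from retrieval.py):

def safe_top1_precision(top1_correct_count, top1_wrong_count):
    # A writer with a single page after aggregation can have neither a
    # correct nor a wrong top-1 match, so both counts are zero; return
    # 0.0 instead of dividing by zero.
    n_retrieved = top1_correct_count + top1_wrong_count
    if n_retrieved == 0:
        return 0.0
    return top1_correct_count / float(n_retrieved)

Whether such single-page writers should count as 0.0 or be excluded from the mean entirely probably depends on the intended evaluation protocol.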