Closed: kamalkraj closed this 1 week ago
410M
2024-06-24 03:43:32 - Loading Corpus...
100%|███████████████████████████████████████████████████████████████████████████████████████████| 5183/5183 [00:00<00:00, 173629.26it/s]
2024-06-24 03:43:32 - Loaded 5183 TEST Documents.
2024-06-24 03:43:32 - Doc Example: {'text': 'Alterations of the architecture of cerebral white matter in the developing human brain can affect cortical development and result in functional disabilities. A line scan diffusion-weighted magnetic resonance imaging (MRI) sequence with diffusion tensor analysis was applied to measure the apparent diffusion coefficient, to calculate relative anisotropy, and to delineate three-dimensional fiber architecture in cerebral white matter in preterm (n = 17) and full-term infants (n = 7). To assess effects of prematurity on cerebral white matter development, early gestation preterm infants (n = 10) were studied a second time at term. In the central white matter the mean apparent diffusion coefficient at 28 wk was high, 1.8 microm2/ms, and decreased toward term to 1.2 microm2/ms. In the posterior limb of the internal capsule, the mean apparent diffusion coefficients at both times were similar (1.2 versus 1.1 microm2/ms). Relative anisotropy was higher the closer birth was to term with greater absolute values in the internal capsule than in the central white matter. Preterm infants at term showed higher mean diffusion coefficients in the central white matter (1.4 +/- 0.24 versus 1.15 +/- 0.09 microm2/ms, p = 0.016) and lower relative anisotropy in both areas compared with full-term infants (white matter, 10.9 +/- 0.6 versus 22.9 +/- 3.0%, p = 0.001; internal capsule, 24.0 +/- 4.44 versus 33.1 +/- 0.6% p = 0.006). Nonmyelinated fibers in the corpus callosum were visible by diffusion tensor MRI as early as 28 wk; full-term and preterm infants at term showed marked differences in white matter fiber organization. The data indicate that quantitative assessment of water diffusion by diffusion tensor MRI provides insight into microstructural development in cerebral white matter in living infants.', 'title': 'Microstructural development of human newborn cerebral white matter assessed in vivo by diffusion tensor magnetic resonance imaging.'}
2024-06-24 03:43:32 - Loading Queries...
2024-06-24 03:43:32 - Loaded 300 TEST Queries.
2024-06-24 03:43:32 - Query Example: 0-dimensional biomaterials show inductive properties.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2024-06-24 03:43:42 - Encoding Queries...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:07<00:00, 40.87it/s]
2024-06-24 03:43:50 - Sorting Corpus by document length (Longest first)...
2024-06-24 03:43:50 - Encoding Corpus in batches... Warning: This might take a while!
2024-06-24 03:43:50 - Scoring Function: Dot Product (dot)
2024-06-24 03:43:50 - Encoding Batch 1/1...
100%|████████████████████████████████████████████████████████████████████████████████████████████████| 5183/5183 [02:03<00:00, 42.09it/s]
2024-06-24 03:45:53 - For evaluation, we ignore identical query and document ids (default), please explicitly set ``ignore_identical_ids=False`` to ignore this.
2024-06-24 03:45:53 -
2024-06-24 03:45:53 - NDCG@1: 0.3967
2024-06-24 03:45:53 - NDCG@3: 0.4353
2024-06-24 03:45:53 - NDCG@5: 0.4560
2024-06-24 03:45:53 - NDCG@10: 0.4717
2024-06-24 03:45:53 - NDCG@100: 0.5002
2024-06-24 03:45:53 - NDCG@1000: 0.5204
2024-06-24 03:45:53 -
2024-06-24 03:45:53 - MAP@1: 0.3807
2024-06-24 03:45:53 - MAP@3: 0.4191
2024-06-24 03:45:53 - MAP@5: 0.4315
2024-06-24 03:45:53 - MAP@10: 0.4386
2024-06-24 03:45:53 - MAP@100: 0.4448
2024-06-24 03:45:53 - MAP@1000: 0.4455
2024-06-24 03:45:53 -
2024-06-24 03:45:53 - Recall@1: 0.3807
2024-06-24 03:45:53 - Recall@3: 0.4634
2024-06-24 03:45:53 - Recall@5: 0.5137
2024-06-24 03:45:53 - Recall@10: 0.5603
2024-06-24 03:45:53 - Recall@100: 0.6879
2024-06-24 03:45:53 - Recall@1000: 0.8497
2024-06-24 03:45:53 -
2024-06-24 03:45:53 - P@1: 0.3967
2024-06-24 03:45:53 - P@3: 0.1678
2024-06-24 03:45:53 - P@5: 0.1140
2024-06-24 03:45:53 - P@10: 0.0637
2024-06-24 03:45:53 - P@100: 0.0079
2024-06-24 03:45:53 - P@1000: 0.0010
1B
2024-06-24 03:46:49 - Loading Corpus...
100%|████████████████████████████████████████████████████████████████████████████████████████████| 5183/5183 [00:00<00:00, 176718.92it/s]
2024-06-24 03:46:49 - Loaded 5183 TEST Documents.
2024-06-24 03:46:49 - Doc Example: {'text': 'Alterations of the architecture of cerebral white matter in the developing human brain can affect cortical development and result in functional disabilities. A line scan diffusion-weighted magnetic resonance imaging (MRI) sequence with diffusion tensor analysis was applied to measure the apparent diffusion coefficient, to calculate relative anisotropy, and to delineate three-dimensional fiber architecture in cerebral white matter in preterm (n = 17) and full-term infants (n = 7). To assess effects of prematurity on cerebral white matter development, early gestation preterm infants (n = 10) were studied a second time at term. In the central white matter the mean apparent diffusion coefficient at 28 wk was high, 1.8 microm2/ms, and decreased toward term to 1.2 microm2/ms. In the posterior limb of the internal capsule, the mean apparent diffusion coefficients at both times were similar (1.2 versus 1.1 microm2/ms). Relative anisotropy was higher the closer birth was to term with greater absolute values in the internal capsule than in the central white matter. Preterm infants at term showed higher mean diffusion coefficients in the central white matter (1.4 +/- 0.24 versus 1.15 +/- 0.09 microm2/ms, p = 0.016) and lower relative anisotropy in both areas compared with full-term infants (white matter, 10.9 +/- 0.6 versus 22.9 +/- 3.0%, p = 0.001; internal capsule, 24.0 +/- 4.44 versus 33.1 +/- 0.6% p = 0.006). Nonmyelinated fibers in the corpus callosum were visible by diffusion tensor MRI as early as 28 wk; full-term and preterm infants at term showed marked differences in white matter fiber organization. The data indicate that quantitative assessment of water diffusion by diffusion tensor MRI provides insight into microstructural development in cerebral white matter in living infants.', 'title': 'Microstructural development of human newborn cerebral white matter assessed in vivo by diffusion tensor magnetic resonance imaging.'}
2024-06-24 03:46:49 - Loading Queries...
2024-06-24 03:46:49 - Loaded 300 TEST Queries.
2024-06-24 03:46:49 - Query Example: 0-dimensional biomaterials show inductive properties.
config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████| 753/753 [00:00<00:00, 241kB/s]
model.safetensors: 100%|████████████████████████████████████████████████████████████████████████████| 3.64G/3.64G [01:06<00:00, 54.9MB/s]
tokenizer_config.json: 100%|████████████████████████████████████████████████████████████████████████| 4.79k/4.79k [00:00<00:00, 2.56MB/s]
tokenizer.json: 100%|███████████████████████████████████████████████████████████████████████████████| 2.31M/2.31M [00:00<00:00, 10.3MB/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2024-06-24 03:47:57 - Encoding Queries...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:05<00:00, 59.54it/s]
2024-06-24 03:48:02 - Sorting Corpus by document length (Longest first)...
2024-06-24 03:48:02 - Encoding Corpus in batches... Warning: This might take a while!
2024-06-24 03:48:02 - Scoring Function: Dot Product (dot)
2024-06-24 03:48:02 - Encoding Batch 1/1...
100%|████████████████████████████████████████████████████████████████████████████████████████████████| 5183/5183 [01:40<00:00, 51.59it/s]
2024-06-24 03:49:43 - For evaluation, we ignore identical query and document ids (default), please explicitly set ``ignore_identical_ids=False`` to ignore this.
2024-06-24 03:49:43 -
2024-06-24 03:49:43 - NDCG@1: 0.0100
2024-06-24 03:49:43 - NDCG@3: 0.0221
2024-06-24 03:49:43 - NDCG@5: 0.0285
2024-06-24 03:49:43 - NDCG@10: 0.0402
2024-06-24 03:49:43 - NDCG@100: 0.0957
2024-06-24 03:49:43 - NDCG@1000: 0.1494
2024-06-24 03:49:43 -
2024-06-24 03:49:43 - MAP@1: 0.0100
2024-06-24 03:49:43 - MAP@3: 0.0183
2024-06-24 03:49:43 - MAP@5: 0.0218
2024-06-24 03:49:43 - MAP@10: 0.0268
2024-06-24 03:49:43 - MAP@100: 0.0365
2024-06-24 03:49:43 - MAP@1000: 0.0382
2024-06-24 03:49:43 -
2024-06-24 03:49:43 - Recall@1: 0.0100
2024-06-24 03:49:43 - Recall@3: 0.0333
2024-06-24 03:49:43 - Recall@5: 0.0483
2024-06-24 03:49:43 - Recall@10: 0.0828
2024-06-24 03:49:43 - Recall@100: 0.3560
2024-06-24 03:49:43 - Recall@1000: 0.7876
2024-06-24 03:49:43 -
2024-06-24 03:49:43 - P@1: 0.0100
2024-06-24 03:49:43 - P@3: 0.0111
2024-06-24 03:49:43 - P@5: 0.0100
2024-06-24 03:49:43 - P@10: 0.0087
2024-06-24 03:49:43 - P@100: 0.0039
2024-06-24 03:49:43 - P@1000: 0.0009
2B
2024-06-23 11:06:57 - Loading Corpus...
100%|████████████████████████████████████████████████████████████████████████████████| 5183/5183 [00:00<00:00, 174244.38it/s]
2024-06-23 11:06:57 - Loaded 5183 TEST Documents.
2024-06-23 11:06:57 - Doc Example: {'text': 'Alterations of the architecture of cerebral white matter in the developing human brain can affect cortical development and result in functional disabilities. A line scan diffusion-weighted magnetic resonance imaging (MRI) sequence with diffusion tensor analysis was applied to measure the apparent diffusion coefficient, to calculate relative anisotropy, and to delineate three-dimensional fiber architecture in cerebral white matter in preterm (n = 17) and full-term infants (n = 7). To assess effects of prematurity on cerebral white matter development, early gestation preterm infants (n = 10) were studied a second time at term. In the central white matter the mean apparent diffusion coefficient at 28 wk was high, 1.8 microm2/ms, and decreased toward term to 1.2 microm2/ms. In the posterior limb of the internal capsule, the mean apparent diffusion coefficients at both times were similar (1.2 versus 1.1 microm2/ms). Relative anisotropy was higher the closer birth was to term with greater absolute values in the internal capsule than in the central white matter. Preterm infants at term showed higher mean diffusion coefficients in the central white matter (1.4 +/- 0.24 versus 1.15 +/- 0.09 microm2/ms, p = 0.016) and lower relative anisotropy in both areas compared with full-term infants (white matter, 10.9 +/- 0.6 versus 22.9 +/- 3.0%, p = 0.001; internal capsule, 24.0 +/- 4.44 versus 33.1 +/- 0.6% p = 0.006). Nonmyelinated fibers in the corpus callosum were visible by diffusion tensor MRI as early as 28 wk; full-term and preterm infants at term showed marked differences in white matter fiber organization. The data indicate that quantitative assessment of water diffusion by diffusion tensor MRI provides insight into microstructural development in cerebral white matter in living infants.', 'title': 'Microstructural development of human newborn cerebral white matter assessed in vivo by diffusion tensor magnetic resonance imaging.'}
2024-06-23 11:06:57 - Loading Queries...
2024-06-23 11:06:57 - Loaded 300 TEST Queries.
2024-06-23 11:06:57 - Query Example: 0-dimensional biomaterials show inductive properties.
config.json: 100%|███████████████████████████████████████████████████████████████████████████| 644/644 [00:00<00:00, 212kB/s]
model.safetensors.index.json: 100%|█████████████████████████████████████████████████████| 12.5k/12.5k [00:00<00:00, 5.44MB/s]
model-00001-of-00003.safetensors: 100%|█████████████████████████████████████████████████| 4.91G/4.91G [01:13<00:00, 66.5MB/s]
model-00002-of-00003.safetensors: 100%|█████████████████████████████████████████████████| 4.98G/4.98G [01:13<00:00, 68.0MB/s]
model-00003-of-00003.safetensors: 100%|███████████████████████████████████████████████████| 134M/134M [00:02<00:00, 64.2MB/s]
Downloading shards: 100%|██████████████████████████████████████████████████████████████████████| 3/3 [02:29<00:00, 49.98s/it]
`config.hidden_act` is ignored, you should use `config.hidden_activation` instead.
Gemma's activation function will be set to `gelu_pytorch_tanh`. Please, use
`config.hidden_activation` if you want to override this behaviour.
See https://github.com/huggingface/transformers/pull/29402 for more details.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████| 3/3 [00:01<00:00, 1.87it/s]
tokenizer_config.json: 100%|█████████████████████████████████████████████████████████████| 1.11k/1.11k [00:00<00:00, 517kB/s]
tokenizer.model: 100%|██████████████████████████████████████████████████████████████████| 4.24M/4.24M [00:00<00:00, 36.2MB/s]
tokenizer.json: 100%|███████████████████████████████████████████████████████████████████| 17.5M/17.5M [00:00<00:00, 84.0MB/s]
2024-06-23 11:09:31 - Encoding Queries...
100%|██████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:15<00:00, 19.78it/s]
2024-06-23 11:09:46 - Sorting Corpus by document length (Longest first)...
2024-06-23 11:09:46 - Encoding Corpus in batches... Warning: This might take a while!
2024-06-23 11:09:46 - Scoring Function: Dot Product (dot)
2024-06-23 11:09:46 - Encoding Batch 1/1...
100%|████████████████████████████████████████████████████████████████████████████████████| 5183/5183 [13:54<00:00, 6.21it/s]
2024-06-23 11:23:40 - For evaluation, we ignore identical query and document ids (default), please explicitly set ``ignore_identical_ids=False`` to ignore this.
2024-06-23 11:23:41 -
2024-06-23 11:23:41 - NDCG@1: 0.2633
2024-06-23 11:23:41 - NDCG@3: 0.3229
2024-06-23 11:23:41 - NDCG@5: 0.3487
2024-06-23 11:23:41 - NDCG@10: 0.3686
2024-06-23 11:23:41 - NDCG@100: 0.4096
2024-06-23 11:23:41 - NDCG@1000: 0.4257
2024-06-23 11:23:41 -
2024-06-23 11:23:41 - MAP@1: 0.2434
2024-06-23 11:23:41 - MAP@3: 0.2991
2024-06-23 11:23:41 - MAP@5: 0.3143
2024-06-23 11:23:41 - MAP@10: 0.3227
2024-06-23 11:23:41 - MAP@100: 0.3308
2024-06-23 11:23:41 - MAP@1000: 0.3315
2024-06-23 11:23:41 -
2024-06-23 11:23:41 - Recall@1: 0.2434
2024-06-23 11:23:41 - Recall@3: 0.3691
2024-06-23 11:23:41 - Recall@5: 0.4318
2024-06-23 11:23:41 - Recall@10: 0.4928
2024-06-23 11:23:41 - Recall@100: 0.6893
2024-06-23 11:23:41 - Recall@1000: 0.8150
2024-06-23 11:23:41 -
2024-06-23 11:23:41 - P@1: 0.2633
2024-06-23 11:23:41 - P@3: 0.1344
2024-06-23 11:23:41 - P@5: 0.0973
2024-06-23 11:23:41 - P@10: 0.0557
2024-06-23 11:23:41 - P@100: 0.0079
2024-06-23 11:23:41 - P@1000: 0.0009
7B
2024-06-24 03:52:37 - Loaded 5183 TEST Documents.
2024-06-24 03:52:37 - Doc Example: {'text': 'Alterations of the architecture of cerebral white matter in the developing human brain can affect cortical development and result in functional disabilities. A line scan diffusion-weighted magnetic resonance imaging (MRI) sequence with diffusion tensor analysis was applied to measure the apparent diffusion coefficient, to calculate relative anisotropy, and to delineate three-dimensional fiber architecture in cerebral white matter in preterm (n = 17) and full-term infants (n = 7). To assess effects of prematurity on cerebral white matter development, early gestation preterm infants (n = 10) were studied a second time at term. In the central white matter the mean apparent diffusion coefficient at 28 wk was high, 1.8 microm2/ms, and decreased toward term to 1.2 microm2/ms. In the posterior limb of the internal capsule, the mean apparent diffusion coefficients at both times were similar (1.2 versus 1.1 microm2/ms). Relative anisotropy was higher the closer birth was to term with greater absolute values in the internal capsule than in the central white matter. Preterm infants at term showed higher mean diffusion coefficients in the central white matter (1.4 +/- 0.24 versus 1.15 +/- 0.09 microm2/ms, p = 0.016) and lower relative anisotropy in both areas compared with full-term infants (white matter, 10.9 +/- 0.6 versus 22.9 +/- 3.0%, p = 0.001; internal capsule, 24.0 +/- 4.44 versus 33.1 +/- 0.6% p = 0.006). Nonmyelinated fibers in the corpus callosum were visible by diffusion tensor MRI as early as 28 wk; full-term and preterm infants at term showed marked differences in white matter fiber organization. The data indicate that quantitative assessment of water diffusion by diffusion tensor MRI provides insight into microstructural development in cerebral white matter in living infants.', 'title': 'Microstructural development of human newborn cerebral white matter assessed in vivo by diffusion tensor magnetic resonance imaging.'}
2024-06-24 03:52:37 - Loading Queries...
2024-06-24 03:52:37 - Loaded 300 TEST Queries.
2024-06-24 03:52:37 - Query Example: 0-dimensional biomaterials show inductive properties.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████| 6/6 [00:03<00:00, 1.86it/s]
2024-06-24 03:52:41 - Encoding Queries...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [00:21<00:00, 14.28it/s]
2024-06-24 03:53:02 - Sorting Corpus by document length (Longest first)...
2024-06-24 03:53:02 - Encoding Corpus in batches... Warning: This might take a while!
2024-06-24 03:53:02 - Scoring Function: Dot Product (dot)
2024-06-24 03:53:02 - Encoding Batch 1/1...
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 5183/5183 [10:01<00:00, 8.61it/s]
2024-06-24 04:03:04 - For evaluation, we ignore identical query and document ids (default), please explicitly set ``ignore_identical_ids=False`` to ignore this.
2024-06-24 04:03:04 -
2024-06-24 04:03:04 - NDCG@1: 0.3533
2024-06-24 04:03:04 - NDCG@3: 0.4226
2024-06-24 04:03:04 - NDCG@5: 0.4416
2024-06-24 04:03:04 - NDCG@10: 0.4601
2024-06-24 04:03:04 - NDCG@100: 0.4812
2024-06-24 04:03:04 - NDCG@1000: 0.4902
2024-06-24 04:03:04 -
2024-06-24 04:03:04 - MAP@1: 0.3291
2024-06-24 04:03:04 - MAP@3: 0.3942
2024-06-24 04:03:04 - MAP@5: 0.4060
2024-06-24 04:03:04 - MAP@10: 0.4145
2024-06-24 04:03:04 - MAP@100: 0.4191
2024-06-24 04:03:04 - MAP@1000: 0.4194
2024-06-24 04:03:04 -
2024-06-24 04:03:04 - Recall@1: 0.3291
2024-06-24 04:03:04 - Recall@3: 0.4764
2024-06-24 04:03:04 - Recall@5: 0.5243
2024-06-24 04:03:04 - Recall@10: 0.5780
2024-06-24 04:03:04 - Recall@100: 0.6748
2024-06-24 04:03:04 - Recall@1000: 0.7459
2024-06-24 04:03:04 -
2024-06-24 04:03:04 - P@1: 0.3533
2024-06-24 04:03:04 - P@3: 0.1744
2024-06-24 04:03:04 - P@5: 0.1160
2024-06-24 04:03:04 - P@10: 0.0650
2024-06-24 04:03:04 - P@100: 0.0076
2024-06-24 04:03:04 - P@1000: 0.0008
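For readers comparing the four log blocks above, here is a minimal sketch of how NDCG@k, Recall@k and P@k are commonly defined for a single query. It is written from the textbook definitions and is not the code that produced these logs; the function names, the toy document ids, and the toy relevance judgments are all assumptions made for illustration, and the evaluation library may differ in details such as tie handling.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k gains (log2 rank discount)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids, qrels, k):
    """NDCG@k: DCG of the ranking divided by the DCG of an ideal ranking."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_ids]
    ideal_gains = sorted(qrels.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def recall_at_k(ranked_ids, qrels, k):
    """Recall@k: fraction of all relevant documents found in the top k."""
    relevant = {doc_id for doc_id, rel in qrels.items() if rel > 0}
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


def precision_at_k(ranked_ids, qrels, k):
    """P@k: fraction of the top k positions holding a relevant document."""
    relevant = {doc_id for doc_id, rel in qrels.items() if rel > 0}
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / k


# Toy query (hypothetical ids): one relevant document, retrieved at rank 2.
toy_qrels = {"d1": 1}
toy_ranking = ["d7", "d1", "d3", "d9", "d4"]

toy_ndcg5 = ndcg_at_k(toy_ranking, toy_qrels, 5)
toy_recall5 = recall_at_k(toy_ranking, toy_qrels, 5)
toy_p5 = precision_at_k(toy_ranking, toy_qrels, 5)
```

In this toy case Recall@5 is 1.0 while P@5 is only 0.2, because a query with a single relevant document can fill at most one of the five positions. That may be part of why the P@k values in the logs look small next to Recall@k, although the number of relevant documents per query is not stated in this thread.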
Hello, thanks for pointing this out. Upon further investigation, there are some differences between your implementation and our implementation:
Instruct:
We will modify the demo example to avoid confusion. Additionally, we've uploaded a demo file eval.py to our GitHub repository. This script has been tested on a different machine and can reproduce the results with a very minimal discrepancy (within 0.001).
Let us know if you still have difficulties in reproducing our results.
Best,
Authors
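As a small illustration of the "within 0.001" criterion mentioned in the reply, the sketch below checks whether a reproduced score matches a reported one under that tolerance. The helper name `reproduces` is hypothetical; the two numbers are the paper score and the NDCG@10 quoted in the question in this thread.

```python
def reproduces(reported: float, obtained: float, tol: float = 0.001) -> bool:
    """Return True when the obtained score is within tol of the reported one."""
    return abs(reported - obtained) <= tol


# Figures quoted in this thread.
reported_ndcg10 = 0.760   # score cited from the paper
obtained_ndcg10 = 0.3686  # NDCG@10 obtained with the original script

gap = abs(reported_ndcg10 - obtained_ndcg10)
matches = reproduces(reported_ndcg10, obtained_ndcg10)
print(f"gap = {gap:.4f}, within tolerance: {matches}")
```

A gap of this size is far outside run-to-run noise, which is consistent with the reply attributing it to implementation differences rather than to hardware or random seeds.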
Thanks
script used for eval
results
In the paper, the reported score is 0.760. Using the above script, I only got NDCG@10: 0.3686.
What is missing? Please help. Thanks
@ritaranx