Open HugoSchtr opened 2 years ago
Let's test training on a sample of re-annotated files:
> Task :train_datacat-segmenter
16:03:34.793 [main] DEBUG org.grobid.core.utilities.GrobidProperties - synchronized getNewInstance
16:03:34.800 [main] WARN org.grobid.core.main.GrobidHomeFinder - No Grobid property was provided. Attempting to find Grobid home in the current directory...
16:03:34.800 [main] WARN org.grobid.core.main.GrobidHomeFinder - ***************************************************************
16:03:34.800 [main] WARN org.grobid.core.main.GrobidHomeFinder - *** USING GROBID HOME: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home
16:03:34.800 [main] WARN org.grobid.core.main.GrobidHomeFinder - ***************************************************************
16:03:34.800 [main] DEBUG org.grobid.core.utilities.GrobidProperties - loading grobid config yaml
16:03:34.800 [main] WARN org.grobid.core.main.GrobidHomeFinder - Grobid config file location was not explicitly set via 'org.grobid.config' system variable, defaulting to: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/config/grobid.yaml
16:03:34.957 [main] DEBUG org.grobid.core.utilities.GrobidProperties - loading pdfalto command path
16:03:34.959 [main] DEBUG org.grobid.core.utilities.GrobidProperties - pdfalto executable home directory set to /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/pdfalto/lin-64
16:03:34.965 [main] INFO org.grobid.core.main.LibraryLoader - Loading external native sequence labelling library
16:03:34.965 [main] DEBUG org.grobid.core.main.LibraryLoader - /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/lib/lin-64
16:03:34.969 [main] INFO org.grobid.core.main.LibraryLoader - Loading Wapiti native library...
16:03:34.970 [main] INFO org.grobid.core.main.LibraryLoader - Native library for sequence labelling loaded
16:03:34.971 [main] DEBUG org.grobid.core.lexicon.Lexicon - Get new instance of Lexicon
16:03:34.971 [main] INFO org.grobid.core.lexicon.Lexicon - Initiating dictionary
16:03:34.971 [main] INFO org.grobid.core.lexicon.Lexicon - End of Initialization of dictionary
16:03:34.971 [main] INFO org.grobid.core.lexicon.Lexicon - Initiating names
16:03:34.971 [main] INFO org.grobid.core.lexicon.Lexicon - End of initialization of names
16:03:35.253 [main] INFO org.grobid.core.lexicon.Lexicon - Initiating country codes
16:03:35.253 [main] INFO org.grobid.core.lexicon.Lexicon - End of initialization of country codes
sourceTEIPathLabel: /home/hscheith/dev/grobid/grobid-datacat/resources/dataset/datacat-segmenter/corpus/tei
sourceRawPathLabel: /home/hscheith/dev/grobid/grobid-datacat/resources/dataset/datacat-segmenter/corpus/raw
trainingOutputPath: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/tmp/datacat-segmenter5585300605279619429.train
evalOutputPath: null
82 tei files
16:03:35.346 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k97781698.training.monograph.tei.xml
Total data found between CRF and TEI files 1322 from total 1785 examples.
16:03:35.692 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k97775611.training.monograph.tei.xml
Total data found between CRF and TEI files 1342 from total 1628 examples.
16:03:35.851 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k97780940.training.monograph.tei.xml
Total data found between CRF and TEI files 67 from total 97 examples.
[...]
16:17:33.431 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777635w.training.monograph.tei.xml
Total data found between CRF and TEI files 677 from total 733 examples.
16:17:33.440 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777373j.training.monograph.tei.xml
Total data found between CRF and TEI files 779 from total 915 examples.
16:17:33.465 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777815t.training.monograph.tei.xml
Total data found between CRF and TEI files 424 from total 593 examples.
16:17:33.488 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777743s.training.monograph.tei.xml
Total data found between CRF and TEI files 1183 from total 2130 examples.
16:17:33.740 [main] DEBUG org.grobid.core.utilities.GrobidProperties - No configuration parameter defined for DeLFT engine for model datacat-segmenter
16:17:33.741 [main] INFO org.grobid.core.jni.WapitiModel - Loading model: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/models/datacat-segmenter/model.wapiti (size: 3424408)
Labeling took: 310 ms
===== Field-level results =====
label             accuracy  precision  recall  f1     support
<annex>           84.5      21.43      25      23.08  12
<back>            92.25     0          0       0      7
<body>            79.84     31.58      31.58   31.58  19
<front>           65.12     28         20.59   23.73  34
all (micro avg.)  80.43     26.23      22.22   24.06  72
all (macro avg.)  80.43     20.25      19.29   19.6   72
===== Instance-level results =====
Total expected instances: 18
Correct instances: 2
Instance-level recall: 11.11
These results are not satisfactory given the amount of data (82 TEI files).
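To make the evaluation figures easier to read: in the first field-level table above, the macro average matches the unweighted mean of the per-label scores, the micro-average F1 matches the harmonic mean of the micro precision and recall, and the instance-level recall matches correct instances over expected instances. Here is a minimal Python sketch that recomputes them from the printed figures. The function and variable names are illustrative and not part of GROBID.

```python
# Per-label scores copied from the first field-level table above:
# label -> (precision, recall, f1)
per_label = {
    "<annex>": (21.43, 25.0, 23.08),
    "<back>": (0.0, 0.0, 0.0),
    "<body>": (31.58, 31.58, 31.58),
    "<front>": (28.0, 20.59, 23.73),
}


def macro_average(scores):
    """Unweighted mean of each metric over labels (every label counts equally)."""
    n = len(scores)
    sums = [sum(values[i] for values in scores.values()) for i in range(3)]
    return tuple(round(s / n, 2) for s in sums)


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def instance_recall(correct, expected):
    """Correct instances over expected instances, in percent."""
    return round(100 * correct / expected, 2)


macro_p, macro_r, macro_f1 = macro_average(per_label)
micro_f1 = round(f1(26.23, 22.22), 2)  # micro precision/recall taken from the table
```

Because the macro average weights every label equally, a label with very low support (such as `<back>` with 7 examples here) pulls it down as much as a frequent one.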
Let's try another, larger sample with different documents:
> Task :train_datacat-segmenter
16:23:38.410 [main] DEBUG org.grobid.core.utilities.GrobidProperties - synchronized getNewInstance
16:23:38.414 [main] WARN org.grobid.core.main.GrobidHomeFinder - No Grobid property was provided. Attempting to find Grobid home in the current directory...
16:23:38.414 [main] WARN org.grobid.core.main.GrobidHomeFinder - ***************************************************************
16:23:38.414 [main] WARN org.grobid.core.main.GrobidHomeFinder - *** USING GROBID HOME: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home
16:23:38.414 [main] WARN org.grobid.core.main.GrobidHomeFinder - ***************************************************************
16:23:38.414 [main] DEBUG org.grobid.core.utilities.GrobidProperties - loading grobid config yaml
16:23:38.414 [main] WARN org.grobid.core.main.GrobidHomeFinder - Grobid config file location was not explicitly set via 'org.grobid.config' system variable, defaulting to: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/config/grobid.yaml
16:23:38.545 [main] DEBUG org.grobid.core.utilities.GrobidProperties - loading pdfalto command path
16:23:38.546 [main] DEBUG org.grobid.core.utilities.GrobidProperties - pdfalto executable home directory set to /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/pdfalto/lin-64
16:23:38.550 [main] INFO org.grobid.core.main.LibraryLoader - Loading external native sequence labelling library
16:23:38.550 [main] DEBUG org.grobid.core.main.LibraryLoader - /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/lib/lin-64
16:23:38.553 [main] INFO org.grobid.core.main.LibraryLoader - Loading Wapiti native library...
16:23:38.553 [main] INFO org.grobid.core.main.LibraryLoader - Native library for sequence labelling loaded
16:23:38.554 [main] DEBUG org.grobid.core.lexicon.Lexicon - Get new instance of Lexicon
16:23:38.554 [main] INFO org.grobid.core.lexicon.Lexicon - Initiating dictionary
16:23:38.554 [main] INFO org.grobid.core.lexicon.Lexicon - End of Initialization of dictionary
16:23:38.554 [main] INFO org.grobid.core.lexicon.Lexicon - Initiating names
16:23:38.554 [main] INFO org.grobid.core.lexicon.Lexicon - End of initialization of names
16:23:38.791 [main] INFO org.grobid.core.lexicon.Lexicon - Initiating country codes
16:23:38.791 [main] INFO org.grobid.core.lexicon.Lexicon - End of initialization of country codes
sourceTEIPathLabel: /home/hscheith/dev/grobid/grobid-datacat/resources/dataset/datacat-segmenter/corpus/tei
sourceRawPathLabel: /home/hscheith/dev/grobid/grobid-datacat/resources/dataset/datacat-segmenter/corpus/raw
trainingOutputPath: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/tmp/datacat-segmenter9332586470912852627.train
evalOutputPath: null
130 tei files
16:23:38.867 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9780625m.training.datacat.tei.xml
Total data found between CRF and TEI files 666 from total 875 examples.
16:23:39.043 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k97781557.training.datacat.tei.xml
Total data found between CRF and TEI files 136 from total 227 examples.
16:23:39.060 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-cb40908886q.training.datacat.tei.xml
Total data found between CRF and TEI files 421 from total 653 examples.
16:23:39.133 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k12438933.training.datacat.tei.xml
Total data found between CRF and TEI files 341 from total 432 examples.
16:23:39.149 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9779220p.training.datacat.tei.xml
Total data found between CRF and TEI files 2157 from total 3548 examples.
[...]
16:49:07.156 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777796x.training.datacat.tei.xml
Total data found between CRF and TEI files 988 from total 1403 examples.
16:49:07.232 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777643f.training.datacat.tei.xml
Total data found between CRF and TEI files 703 from total 1302 examples.
16:49:07.337 [main] DEBUG org.grobid.core.utilities.GrobidProperties - No configuration parameter defined for DeLFT engine for model datacat-segmenter
16:49:07.337 [main] INFO org.grobid.core.jni.WapitiModel - Loading model: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/models/datacat-segmenter/model.wapiti (size: 9273151)
Labeling took: 1232 ms
===== Field-level results =====
label             accuracy  precision  recall  f1     support
<annex>           98.25     0          0       0      1
<back>            92.98     0          0       0      10
<body>            78.36     48.48      44.44   46.38  36
<front>           67.84     42.42      28      33.73  50
all (micro avg.)  84.36     42.86      30.93   35.93  97
all (macro avg.)  84.36     22.73      18.11   20.03  97
===== Instance-level results =====
Total expected instances: 33
Correct instances: 11
Instance-level recall: 33.33
Results are better than in the previous training run, and better than earlier models trained with more complex labels and more documents. This is encouraging.
With the new, debugged high-level segmentation model, training now works as intended. Scores are much better, and although they are not perfect yet, the extraction is already satisfactory.
11:31:42.396 [main] DEBUG org.grobid.core.utilities.GrobidProperties - synchronized getNewInstance
11:31:42.401 [main] WARN org.grobid.core.main.GrobidHomeFinder - No Grobid property was provided. Attempting to find Grobid home in the current directory...
11:31:42.401 [main] WARN org.grobid.core.main.GrobidHomeFinder - ***************************************************************
11:31:42.401 [main] WARN org.grobid.core.main.GrobidHomeFinder - *** USING GROBID HOME: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home
11:31:42.401 [main] WARN org.grobid.core.main.GrobidHomeFinder - ***************************************************************
11:31:42.401 [main] DEBUG org.grobid.core.utilities.GrobidProperties - loading grobid config yaml
11:31:42.401 [main] WARN org.grobid.core.main.GrobidHomeFinder - Grobid config file location was not explicitly set via 'org.grobid.config' system variable, defaulting to: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/config/grobid.yaml
11:31:42.544 [main] DEBUG org.grobid.core.utilities.GrobidProperties - loading pdfalto command path
11:31:42.545 [main] DEBUG org.grobid.core.utilities.GrobidProperties - pdfalto executable home directory set to /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/pdfalto/lin-64
11:31:42.550 [main] INFO org.grobid.core.main.LibraryLoader - Loading external native sequence labelling library
11:31:42.550 [main] DEBUG org.grobid.core.main.LibraryLoader - /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/lib/lin-64
11:31:42.553 [main] INFO org.grobid.core.main.LibraryLoader - Loading Wapiti native library...
11:31:42.553 [main] INFO org.grobid.core.main.LibraryLoader - Native library for sequence labelling loaded
11:31:42.554 [main] DEBUG org.grobid.core.lexicon.Lexicon - Get new instance of Lexicon
11:31:42.554 [main] INFO org.grobid.core.lexicon.Lexicon - Initiating dictionary
11:31:42.554 [main] INFO org.grobid.core.lexicon.Lexicon - End of Initialization of dictionary
11:31:42.554 [main] INFO org.grobid.core.lexicon.Lexicon - Initiating names
11:31:42.554 [main] INFO org.grobid.core.lexicon.Lexicon - End of initialization of names
11:31:42.805 [main] INFO org.grobid.core.lexicon.Lexicon - Initiating country codes
11:31:42.805 [main] INFO org.grobid.core.lexicon.Lexicon - End of initialization of country codes
sourceTEIPathLabel: /home/hscheith/dev/grobid/grobid-datacat/resources/dataset/datacat-segmenter/corpus/tei
sourceRawPathLabel: /home/hscheith/dev/grobid/grobid-datacat/resources/dataset/datacat-segmenter/corpus/raw
trainingOutputPath: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/tmp/datacat-segmenter11504674491793600357.train
evalOutputPath: null
363 tei files
11:31:42.940 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9779206d.training.datacat.tei.xml
11:31:43.000 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9778458h.training.datacat.tei.xml
11:31:43.007 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9781821z.training.datacat.tei.xml
11:31:43.018 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9779365m.training.datacat.tei.xml
11:31:43.020 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9780625m.training.datacat.tei.xml
[...]
11:31:46.320 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9780628v.training.datacat.tei.xml
epsilon: 1.0E-7
window: 50
nb max iterations: 2000
nb threads: 16
Model for datacat-segmenter created in 6530131 ms
sourceTEIPathLabel: /home/hscheith/dev/grobid/grobid-datacat/resources/dataset/datacat-segmenter/evaluation/tei
sourceRawPathLabel: /home/hscheith/dev/grobid/grobid-datacat/resources/dataset/datacat-segmenter/evaluation/raw
trainingOutputPath: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/tmp/datacat-segmenter6461545812222573652.test
evalOutputPath: null
73 tei files
13:20:33.071 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777971s.training.datacat.tei.xml
13:20:33.082 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777565p.training.datacat.tei.xml
13:20:33.086 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777416v.training.datacat.tei.xml
13:20:33.114 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777569b.training.datacat.tei.xml
13:20:33.119 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777635w.training.datacat.tei.xml
13:20:33.125 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777738g.training.datacat.tei.xml
13:20:33.136 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777980r.training.datacat.tei.xml
13:20:33.139 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777764z.training.datacat.tei.xml
13:20:33.151 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777648h.training.datacat.tei.xml
13:20:33.154 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 2148-bpt6k97784009.training.datacat.tei.xml
13:20:33.155 [main] ERROR org.grobid.trainer.AbstractTrainer - The raw file does not exist: /home/hscheith/dev/grobid/grobid-datacat/resources/dataset/datacat-segmenter/evaluation/raw/2148-bpt6k97784009.training.datacat
13:20:33.155 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777351z.training.datacat.tei.xml
[...]
13:20:33.730 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777387k.training.datacat.tei.xml
13:20:33.737 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9777385r.training.datacat.tei.xml
13:20:33.750 [main] DEBUG org.grobid.core.utilities.GrobidProperties - No configuration parameter defined for DeLFT engine for model datacat-segmenter
13:20:33.750 [main] INFO org.grobid.core.jni.WapitiModel - Loading model: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/models/datacat-segmenter/model.wapiti (size: 37965878)
Labeling took: 5812 ms
===== Field-level results =====
label             accuracy  precision  recall  f1     support
<annex>           96.52     58.33      63.64   60.87  22
<back>            93.04     55.17      41.03   47.06  39
<body>            87.81     58.11      57.33   57.72  75
<front>           68.86     44.44      32.21   37.35  149
all (micro avg.)  86.56     51.49      42.46   46.54  285
all (macro avg.)  86.56     54.01      48.55   50.75  285
===== Instance-level results =====
Total expected instances: 72
Correct instances: 15
Instance-level recall: 20.83
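The log above reports one TEI file without a matching raw file. Its name starts with 2148- where every other file starts with 12148-, so the TEI file name may simply be missing a leading digit. A small pre-flight check could catch such mismatches before a long training run. This is only a sketch: the naming rule (raw file name = TEI file name without the trailing ".tei.xml") is inferred from the error message, and `find_missing_raw` is a hypothetical helper, not a GROBID function.

```python
from pathlib import Path

TEI_SUFFIX = ".tei.xml"


def find_missing_raw(tei_dir, raw_dir):
    """Return names of TEI files that have no matching raw feature file.

    Assumption (inferred from the error message above): the raw file has
    the same name as the TEI file with the trailing ".tei.xml" removed.
    """
    tei_dir, raw_dir = Path(tei_dir), Path(raw_dir)
    missing = []
    for tei_file in sorted(tei_dir.glob("*" + TEI_SUFFIX)):
        raw_name = tei_file.name[: -len(TEI_SUFFIX)]
        if not (raw_dir / raw_name).is_file():
            missing.append(tei_file.name)
    return missing
```

It could be run on the evaluation/tei and evaluation/raw directories (and on the corpus ones) listed in the log; an empty list means every TEI file has its raw counterpart.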
163 files used, all from the same collection (bienaimé-feuardent), with a random split of the corpus to evaluate the model.
[...]
14:02:54.391 [main] INFO org.grobid.core.lexicon.Lexicon - End of initialization of country codes
sourceTEIPathLabel: /home/hscheith/dev/grobid/grobid-datacat/resources/dataset/datacat-segmenter/corpus/tei
sourceRawPathLabel: /home/hscheith/dev/grobid/grobid-datacat/resources/dataset/datacat-segmenter/corpus/raw
trainingOutputPath: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/tmp/datacat-segmenter6091050958366937291.train
evalOutputPath: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/tmp/datacat-segmenter8384666151559798433.test
163 tei files
14:02:54.474 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9780625m.training.datacat.tei.xml
[...]
14:02:56.380 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9780628v.training.datacat.tei.xml
epsilon: 1.0E-7
window: 50
nb max iterations: 2000
nb threads: 16
* Load patterns
* Load training data
* Initialize the model
* Summary
nb train: 152
nb labels: 9
nb blocks: 1534981
nb features: 13814901
* Train the model with l-bfgs
[ 1] obj=280716.19 act=3363480 err= 7.65%/100.00% time=7.37s/7.37s
[...]
[ 247] obj=310.02 act=3214 err= 0.00%/ 0.66% time=4.97s/1525.94s
* Save the model
* Done
14:28:33.621 [main] DEBUG org.grobid.core.utilities.GrobidProperties - No configuration parameter defined for DeLFT engine for model datacat-segmenter
14:28:33.621 [main] INFO org.grobid.core.jni.WapitiModel - Loading model: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/models/datacat-segmenter/model.wapiti (size: 24164096)
[Wapiti] Loading model: "/home/hscheith/dev/grobid/grobid-datacat/../grobid-home/models/datacat-segmenter/model.wapiti"
Model path: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/models/datacat-segmenter/model.wapiti
Labeling took: 587 ms
===== Field-level results =====
label             accuracy  precision  recall  f1     support
<back>            88.89     0          0       0      1
<body>            92.59     87.5       87.5    87.5   8
<front>           70.37     55.56      55.56   55.56  9
all (micro avg.)  83.95     63.16      66.67   64.86  18
all (macro avg.)  83.95     47.69      47.69   47.69  18
===== Instance-level results =====
Total expected instances: 8
Correct instances: 5
Instance-level recall: 62.5
Split, training and evaluation for datacat-segmenter model is realized in 1541156 ms
Observation: there is probably not enough data for each label (in this evaluation set, <back> has a support of 1 and <annex> does not appear at all).
Results vary depending on the evaluation set. For example, another model trained on the same dataset, but with a different split, gives:
===== Field-level results =====
label             accuracy  precision  recall  f1     support
<annex>           97.03     0          0       0      1
<back>            89.11     33.33      22.22   26.67  9
<body>            92.08     66.67      76.92   71.43  13
<front>           65.35     31.25      17.24   22.22  29
all (micro avg.)  85.89     43.59      32.69   37.36  52
all (macro avg.)  85.89     32.81      29.1    30.08  52
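Since the scores move noticeably from one random split to another, it may help to make each split reproducible, so that a given result can be traced back to the exact train/evaluation file lists. The sketch below is a plain seeded shuffle in Python and is not how the GROBID trainer performs its own split; `split_files`, its parameters and the default ratio are illustrative assumptions.

```python
import random


def split_files(file_names, train_ratio=0.9, seed=0):
    """Shuffle file names with a fixed seed and cut them into train/eval lists.

    train_ratio=0.9 is only an illustrative default.
    """
    names = sorted(file_names)          # sort first so the result does not depend on input order
    random.Random(seed).shuffle(names)  # local RNG: the global random state is untouched
    cut = round(len(names) * train_ratio)
    return names[:cut], names[cut:]
```

Repeating training and evaluation over a few different seeds and reporting the spread of the scores would give a fairer picture than a single split, at the cost of one training run per seed.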
A new training run with the Bourgey corpus gives the following scores (436 TEI files, with 95% used for training and 5% for evaluation).
436 tei files
13:54:43.836 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9779206d.training.datacat.tei.xml
[...]
13:54:47.547 [main] INFO org.grobid.trainer.AbstractTrainer - Processing: 12148-bpt6k9780628v.training.datacat.tei.xml
epsilon: 1.0E-7
window: 50
nb max iterations: 1000
nb threads: 16
* Load patterns
* Load training data
* Initialize the model
* Summary
nb train: 405
nb labels: 9
nb blocks: 2501866
nb features: 22516866
* Train the model with l-bfgs
[ 1] obj=1456315.19 act=5464835 err=22.25%/100.00% time=11.04s/11.04s
[...]
[ 715] obj=804.87 act=4649 err= 0.00%/ 0.99% time=8.62s/6631.52s
* Save the model
* Done
15:45:43.330 [main] DEBUG org.grobid.core.utilities.GrobidProperties - No configuration parameter defined for DeLFT engine for model datacat-segmenter
15:45:43.330 [main] INFO org.grobid.core.jni.WapitiModel - Loading model: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/models/datacat-segmenter/model.wapiti (size: 41682587)
[Wapiti] Loading model: "/home/hscheith/dev/grobid/grobid-datacat/../grobid-home/models/datacat-segmenter/model.wapiti"
Model path: /home/hscheith/dev/grobid/grobid-datacat/../grobid-home/models/datacat-segmenter/model.wapiti
Labeling took: 2058 ms
===== Field-level results =====
label             accuracy  precision  recall  f1     support
<annex>           98.33     87.5       87.5    87.5   8
<back>            97.5      88.89      80      84.21  10
<body>            85.83     67.86      70.37   69.09  27
<front>           76.67     60.53      63.89   62.16  36
all (micro avg.)  89.58     68.67      70.37   69.51  81
all (macro avg.)  89.58     76.19      75.44   75.74  81
===== Instance-level results =====
Total expected instances: 27
Correct instances: 15
Instance-level recall: 55.56
Let's try to balance the training corpus and observe the results.
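Before balancing, it can be useful to measure how unbalanced the training data actually is. The sketch below counts how often each label occurs in CRF training data. It rests on several assumptions about the file layout that should be checked against the real files: one row per labelled unit, whitespace-separated features, the label in the last column, blank rows between documents, and an optional "I-" prefix on the first row of a field. `count_labels` is a hypothetical helper.

```python
from collections import Counter


def count_labels(lines):
    """Count labels in CRF training data given as an iterable of text lines.

    Assumptions: whitespace-separated columns, label in the last column,
    blank rows between documents, optional "I-" prefix marking the first
    row of a field.
    """
    counts = Counter()
    for line in lines:
        columns = line.split()
        if not columns:  # blank row = document boundary
            continue
        label = columns[-1]
        if label.startswith("I-"):
            label = label[2:]  # merge "I-<front>" with "<front>"
        counts[label] += 1
    return counts
```

Passing an open file handle works, since iterating over a file yields its lines; `count_labels(f).most_common()` then lists the labels from most to least frequent.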
Since the model's performance on the high-level segmentation won't go higher than ~45% precision/recall/F1, we're trying a new, much simpler segmentation:
As in the last experiments, we're aiming for very good performance on body, which contains all the sale catalogue entries.