How we solve this problem
The file wikitable_stat.csv lets us choose better tables for extraction: for each URL, it reports the number of tables whose class contains:
box (the tables we inspected showed that tables with box in their class are irrelevant: they generally hold no important information and are used to lay out elements such as photos and text)
nav (irrelevant for the same reason as tables with box in their class)
nowraplinks (tables that contain links to other pages; in wikitext they are produced by a call to a plugin dedicated to rendering this kind of table)
other (tables on any page that meet none of the conditions above are considered relevant; we plan to refine this criterion further)
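The counting described above can be sketched as follows. This is only an illustration, not the project's actual implementation: the function names, the CSV column names (url, box, nav, nowraplinks, other) and the choice to count a table once for every marker its class contains are all assumptions. It uses only the Python standard library.

```python
import csv
import io
from collections import Counter
from html.parser import HTMLParser

# Substrings of the class attribute taken from the list above.
MARKERS = ("box", "nav", "nowraplinks")


class TableClassCounter(HTMLParser):
    """Counts <table> tags by the markers found in their class attribute."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "table":
            return
        # A missing or valueless class attribute is treated as an empty string.
        class_attr = dict(attrs).get("class") or ""
        matched = [marker for marker in MARKERS if marker in class_attr]
        for marker in matched:
            self.counts[marker] += 1
        # Assumption: a table matching no marker is a candidate for extraction.
        if not matched:
            self.counts["other"] += 1


def count_table_classes(html):
    """Return the per-marker table counts for one page of HTML."""
    parser = TableClassCounter()
    parser.feed(html)
    return {key: parser.counts[key] for key in (*MARKERS, "other")}


def stats_to_csv(rows):
    """Render {url: counts} as CSV text, one line per URL (hypothetical layout)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["url", *MARKERS, "other"], lineterminator="\n"
    )
    writer.writeheader()
    for url, counts in rows.items():
        writer.writerow({"url": url, **counts})
    return buffer.getvalue()
```

Because matching is done on substrings, a class such as navbox is counted under both box and nav. If each table should fall into exactly one category, the loop over the matched markers would need to stop at the first hit.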
Revision of the relevance criteria in order to extract better information and produce well-structured CSV files