hongcui / FNATextProcessing

producing clean FNA input

There are duplicate files BETULOIDEAE.xml and BETULACEAE_subfam._BETULOIDEAE.xml in Volume 3 #22

Open jocelynpender opened 6 years ago

jocelynpender commented 6 years ago

There are duplicate files BETULOIDEAE.xml and BETULACEAE_subfam._BETULOIDEAE.xml

family betulaceae;genus BETULOIDEAE

jocelynpender commented 6 years ago

These are the files I moved to a dupes folder and ignored:

Input/FNA_Github_June_15_2016/V3/numerical_files/no_keys/ { → dupes } /1129.xml
Input/FNA_Github_June_15_2016/V3/numerical_files/no_keys/ { → dupes } /366.xml
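For anyone who wants to look for further duplicates in the same folder, here is a minimal sketch that groups files by a hash of their contents. It assumes the duplicates are byte-identical, which this thread does not confirm, so files that differ only in whitespace or markup would not be caught. The function name `find_identical_files` is hypothetical and not part of this repository.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_identical_files(folder, pattern="*.xml"):
    """Group files directly under `folder` whose bytes are identical.

    Assumption: duplicates are exact byte-for-byte copies.
    """
    groups = defaultdict(list)
    for path in sorted(Path(folder).glob(pattern)):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path.name)
    # Keep only the digests shared by more than one file.
    return [names for names in groups.values() if len(names) > 1]
```

Calling it on a folder such as the `no_keys` directory would return a list of groups, each group holding the names of files with the same content.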

bibilujan commented 6 years ago

These duplicate files are still in the V3/numerical_files folder. I also realized that they are tagged as genus instead of subfamily (files 70.xml and 366.xml in V3). The same occurs with the subfamily Coryloideae in files 313.xml and 1129.xml in V3.

The duplicated files will be removed (they are still in Input/FNA_Github_June_15_2016/V3/numerical_files/no_keys/ { → dupes }), and the taxon name, hierarchy, and authority will be manually fixed in files 70 and 313 (the non-duplicate files that were kept).
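A quick way to spot other taxa with the same rank problem might look like the sketch below. It relies on the botanical convention that subfamily names end in "-oideae" and flags any such name that carries a different rank. It assumes the hierarchy is available as a `rank name;rank name` string, as in "family betulaceae;genus BETULOIDEAE" quoted earlier in this issue; how that string is stored in the XML files is not shown here. The function name `misranked_taxa` is hypothetical, and the check is only a heuristic.

```python
def misranked_taxa(hierarchy):
    """Return (rank, name) pairs whose name ends in -oideae
    but whose rank is not "subfamily".

    Assumption: `hierarchy` looks like "family betulaceae;genus BETULOIDEAE".
    """
    flagged = []
    for entry in hierarchy.split(";"):
        # Split each entry into its rank and the rest (the taxon name).
        rank, _, name = entry.strip().partition(" ")
        if name.lower().endswith("oideae") and rank.lower() != "subfamily":
            flagged.append((rank, name))
    return flagged


print(misranked_taxa("family betulaceae;genus BETULOIDEAE"))
# → [('genus', 'BETULOIDEAE')]
```

A correctly tagged string such as "family betulaceae;subfamily Betuloideae" returns an empty list.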