GoogleCodeExporter closed this issue 9 years ago
Original comment by oliver.m...@gmail.com
on 26 Apr 2011 at 8:55
Not seeing any null ranks on TC.
340730 out of 6137571 ranks in TN are null (~6%). Investigating.
Original comment by lars.fra...@gmail.com
on 26 Apr 2011 at 2:13
340639 out of 6051987 Taxon Names coming from CLB have NULL ranks. That leaves
91 other NULLs. 0 are coming from occ_taxon_name so the rest must be from
typification_record.
Original comment by lars.fra...@gmail.com
on 26 Apr 2011 at 2:25
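The counts quoted in the two comments above can be cross-checked with a few lines of Python. This is only a sanity check of the arithmetic: the variable names are illustrative, and the numbers are the ones given in the thread.

```python
# Counts quoted in the comments above; the variable names are illustrative.
total_tn = 6137571       # all ranks in TN
null_tn = 340730         # NULL ranks in TN
null_from_clb = 340639   # NULL ranks on Taxon Names coming from CLB
null_from_occ = 0        # NULL ranks coming from occ_taxon_name

# Share of NULL ranks in TN, and the NULLs left unexplained by CLB and occ_taxon_name.
null_share = null_tn / total_tn
remaining = null_tn - null_from_clb - null_from_occ

print(f"NULL share in TN: {null_share:.2%}")
print(f"NULLs left for typification_record: {remaining}")
```

The remainder matches the 91 NULLs attributed to typification_record.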
See:
http://code.google.com/p/gbif-ecat/source/detail?r=3735
http://code.google.com/p/gbif-ecat/source/detail?r=3736
No NULL ranks should come from CLB (clb3) now
Original comment by timrobertson100
on 26 Apr 2011 at 3:27
NULL ranks from typification_record gone as well with r523
Original comment by lars.fra...@gmail.com
on 26 Apr 2011 at 5:51
Of 7.2 million names, I see 4.3m with null rank
Reopening this issue
select rank, count(1) from tim_rollover2_temp_normalized group by rank
gives the following. Note the 3 columns, again suggesting a corrupt table. Perhaps the
delimiter change in NormalizeTaxonomy needs to be rolled back?
C 8668
C 2003488 1
C 3236611 1
C 5257628 1
C 5257963 1
C 5273992 1
C 5308109 1
C 5308444 1
C 5933889 1
C 5941511,5941512 1
C 5941590,5941591 1
C 5947566,5947567 1
C 5947577 1
C 7085610 1
C 7154496 1
C 7154533,7154532 1
C 876685 1
C 876731 1
F 199148
F 121257 1
F 122050 1
F 1286407 1
F 1508184 1
F 1533035 1
F 1538105 1
F 1538295 1
F 1538341 1
F 1663126 1
F 1663127 1
F 1663128 1
F 1663129 1
F 1663130 1
F 1663131 1
F 1663132 1
F 1663133 1
F 1663134 1
F 1663135 1
F 1663136 1
F 1663137 1
F 1663138 1
F 1663139 1
F 1663140 1
F 1663141 1
F 1663142 1
F 1663143 1
F 1663144 1
F 1663145 1
F 1663146 1
F 1663147 1
F 1663148 1
F 1663149 1
F 1663150 1
F 1663151 1
F 1663152 1
F 1663153 1
F 1663154 1
F 1663155 1
F 1663156 1
F 1663157 1
F 1663158 1
F 1663159 1
F 1663160 1
F 1663161 1
F 1663162 1
F 1663163 1
F 1663164 1
F 1663165 1
F 1663166 1
F 1663167 1
F 1663168 1
F 1663169 1
F 1663170 1
F 1663171 1
F 1663172 1
F 1663173 1
F 1663174 1
F 1663175 1
F 1663176 1
F 1663177 1
F 1663178 1
F 1663179 1
F 1663180 1
F 1663181 1
F 1663182 1
F 1663183 1
F 1663184 1
F 1663185 1
F 1663186 1
F 1663187 1
F 1663188 1
F 1663189 1
F 1663190 1
F 1663191 1
F 1663192 1
F 1663193 1
F 1663194 1
F 1663195 1
F 1663196 1
F 1663197 1
F 1663198 1
Original comment by timrobertson100
on 28 Apr 2011 at 8:22
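A group-by on rank with a count should yield two fields per row, so rows with any other width point at the corruption described above. The following is a rough sketch of that check, assuming the result has been pasted as whitespace-separated text; the function name `unexpected_rows` is made up for illustration, and the sample rows are copied from the output above.

```python
def unexpected_rows(output_text, expected_fields=2):
    """Return the lines whose whitespace-separated field count differs from expected_fields."""
    bad = []
    for line in output_text.splitlines():
        fields = line.split()
        # Skip blank lines; keep every line with the wrong number of fields.
        if fields and len(fields) != expected_fields:
            bad.append(line)
    return bad


# A few rows copied from the query result above.
sample = """C 8668
C 2003488 1
C 5941511,5941512 1
F 199148
F 121257 1"""

for line in unexpected_rows(sample):
    print(line)
```

Only the `C 8668` and `F 199148` rows have the expected shape; the others carry an extra field.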
Please see the tim_rollover2_temp_normalized table in Hue. It has an extra
NULL column at the end of each row, which is likely the cause of this issue:
data_resource_id local_id local_parent_id name author rank denormalized_taxonomy_ids
10001 1 NULL Calyptrosphaera NULL G NULL
Original comment by timrobertson100
on 28 Apr 2011 at 8:26
See the revert in r533 and r534.
This is needed because the MR job uses TextOutputFormat, which writes the \t
character as its separator.
Tests are running.
Original comment by timrobertson100
on 28 Apr 2011 at 8:35
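To illustrate why a delimiter mismatch can produce shifted values and a trailing NULL, here is a small simulation. It is a sketch under assumptions, not the actual NormalizeTaxonomy code: the "|" delimiter, the shortened column list and both helper functions are hypothetical, and padding missing fields with None stands in for the NULLs the table reader is assumed to show.

```python
def write_record(key, value_fields, value_delimiter):
    # Mimics the default TextOutputFormat layout: key, a tab, then the value.
    return key + "\t" + value_delimiter.join(value_fields)


def read_record(line, field_delimiter, columns):
    # Split on the table's field delimiter; pad missing trailing fields with None (NULL).
    fields = line.split(field_delimiter)
    padded = fields[:len(columns)] + [None] * max(0, len(columns) - len(fields))
    return dict(zip(columns, padded))


COLUMNS = ["local_id", "name", "rank"]  # shortened, hypothetical column list

# Writer and reader agree on tab: every field lands in its own column.
consistent = read_record(
    write_record("1", ["Calyptrosphaera", "G"], "\t"), "\t", COLUMNS
)

# Value fields joined with a hypothetical "|" while the key is still followed by a tab:
# the key and first field fuse into one column and the last column becomes None.
mismatched = read_record(
    write_record("1", ["Calyptrosphaera", "G"], "|"), "|", COLUMNS
)

print(consistent)
print(mismatched)
```

In the mismatched case the rank column ends up empty and the first column holds two values joined by a tab, which resembles the symptoms reported above, although the thread does not confirm this exact mechanism.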
Confirmed that reverting fixes the OR names issue. Marking as fixed again.
Original comment by timrobertson100
on 29 Apr 2011 at 4:59
Original issue reported on code.google.com by
timrobertson100
on 22 Apr 2011 at 6:04