Closed GoogleCodeExporter closed 9 years ago
Taxon_concept_id in question: 53354205
Original comment by timrobertson100
on 27 Apr 2011 at 6:11
select * from tim_rollover2_temp_taxon_distinct_taxonomy where
taxon_concept_id= 53354205
=> 53354205 S
How can this be S when it is an INT ID?
Original comment by timrobertson100
on 27 Apr 2011 at 6:16
select * from tim_rollover2_temp_normalized3 where id= 53354205
53354205 53354202 1472 NULL 0 ["S"]
Indeed it is "S"
Original comment by timrobertson100
on 27 Apr 2011 at 6:21
select * from tim_rollover2_temp_normalized2 where id= 53354205
0 53354205 1472 123 120 Spatoglossum asperum J. Agardh 0 S
Still "S" and still 1472
Original comment by timrobertson100
on 27 Apr 2011 at 6:26
Strange result in the temp_distinct_taxonomy itself:
select * from tim_rollover2_temp_distinct_taxonomy where data_resource_id=1472
and scientific_name='Spatoglossum asperum'
2706679 1472 NULL NULL NULL NULL Dictyotaceae Spatoglossum Spatoglossum
asperum J.
Agardh ["57325397","57325399","57325422","57325425","57325439","57325447","57325
465","57325471","57325497","57325499","57325501","57325503"]
2706680 1472 NULL NULL NULL NULL Dictyotaceae Spatoglossum Spatoglossum
asperum J. Agardh ` ["57325451"]
The columns don't align and there is a spurious ` either in the author or the
occurrence field.
Original comment by timrobertson100
on 27 Apr 2011 at 6:31
Hive is producing strange results.
Getting the CSV file of the following query:
select author from tim_rollover2_temp_distinct_taxonomy where
data_resource_id=1472 and scientific_name='Spatoglossum asperum'
provides:
"_col0"
"J. Agardh"
"J. Agardh","`"
This seems to be throwing everything off subsequently, and likely to be the
cause of the real issue logged here.
Original comment by timrobertson100
on 27 Apr 2011 at 6:35
Looking at the LIVE portal, we see:
mysql> select data_resource_id,author from raw_occurrence_record where
id=57325451;
+------------------+-------------+
| data_resource_id | author |
+------------------+-------------+
| 1472 | J. Agardh ` |
+------------------+-------------+
The good news is that the DR id looks aligned, so it seems this field is
throwing everything off, and misaligning joins.
Need to investigate what the character is that is causing this
Original comment by timrobertson100
on 27 Apr 2011 at 6:41
It is a \t
Original comment by timrobertson100
on 27 Apr 2011 at 6:46
Original comment by timrobertson100
on 27 Apr 2011 at 6:48
Original comment by oliver.m...@gmail.com
on 28 Apr 2011 at 7:24
This appears to be solved due to \t (and other bad char) cleanup in the source.
Original comment by oliver.m...@gmail.com
on 25 May 2011 at 8:02
Original issue reported on code.google.com by
timrobertson100
on 27 Apr 2011 at 6:08