Closed jhammock closed 8 years ago
Could you provide an example of a duplicate taxon page, please?
Are these two duplicates? http://eol.org/pages/13997/overview and http://eol.org/pages/52632/overview ?
...If so, I think I see where the problem is (but it's REALLY complicated code, so will take a while to untangle)...
Not duplicates of the type I'm seeing for this problem. I'm seeing new taxon pages with only one resource (e.g., BioTraits or Feller) in them, generated in addition to the existing, corresponding taxon pages that hold all the other content. The two taxa in your comment look to me like garden-variety duplicates that already existed in our system: multiple partners are represented in each.
Drat. Okay, I'll keep looking. Thanks.
Sorry to ask, but for reasons that are complicated, could you provide an example from Anne's BioTraits?
(Basically, I have good logs of the harvest of that one, but no logs for the other, so it's much easier to track down what happened.)
Sure :)
Original page: http://eol.org/pages/901261/overview
BioTraits page: http://eol.org/pages/45876733/overview
Well. Pretty major bug, it looks like: new resources aren't even getting compared, because they aren't added to Solr (matches are all done by comparing things in Solr... this was a change from many years ago and was done because PL found it to be quite a bit faster, even though it's much less clear what's happening). Anyway: Interesting. ...Digging further!
A couple of artifacts from when this first came up last year:
The SI Type data import where we first noticed it, some discussion in the later comments, Eli and Katja: https://eol-jira.bibalex.org/browse/WEB-5843
@JRice to @YoustinaAtef, a few weeks later: (https://eol-jira.bibalex.org/browse/WEB-5896) We need someone to go into the Harvesting Process and figure out where taxon concept resolution occurs (i.e.: given a new hierarchy entry, how does the code decide which taxon concept to place it in, or to make it a new node?) ...We then need this code deeply understood, with (ideally) a graphic representation of the decision tree used to place the entry. THIS IS VERY VERY IMPORTANT. It is a blocker for many other tickets, though I'm not calling it a "blocker" for all development.
Youstina wrote up a report and attached it to the ticket. We don't have access to JIRA attachments yet, but I suppose she might have an offline copy. Either way, she can perhaps shed some light here. (Of course, adding new resources to Solr is perhaps pre-supposed for that process...)
Sorry, just need to quickly brain-dump. This won't sound like a day's worth of work, either, but it's... complicated. 1502 is the hierarchy id for Anne's resource; it should match an entry in hierarchy 903 (among others) to map to the page Jen mentions.
@solr = SolrCore::HierarchyEntryRelationships.new
r = @solr.paginate("hierarchy_id_1:1502"); r["response"]["numFound"] # => 0
r = @solr.paginate("hierarchy_id_1:903"); r["response"]["numFound"] # => 15827973
...So it's not making it to the HER Solr core... But it is in HE core:
@solr = SolrCore::HierarchyEntries.new
r = @solr.paginate("hierarchy_id:903"); r["response"]["numFound"] # => 986586
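A minimal sketch of the comparison above, assuming only that each core wrapper responds to `paginate` and returns a Solr-style hash with `response` / `numFound`, as the console output suggests. `FakeCore`, `num_found`, and `missing_from_relationships` are hypothetical names used so the snippet runs outside the EOL app, and the count given for hierarchy 1502 in the entries core is a placeholder, not a figure from this thread.

```ruby
# Hypothetical helper: number of documents a core reports for a query.
def num_found(core, query)
  core.paginate(query).fetch("response").fetch("numFound")
end

# Stand-in for the SolrCore wrappers; answers from a canned table of counts.
class FakeCore
  def initialize(counts)
    @counts = counts
  end

  def paginate(query)
    { "response" => { "numFound" => @counts.fetch(query, 0) } }
  end
end

# Hierarchies that have entries indexed but no relationships indexed.
def missing_from_relationships(entries_core, relationships_core, hierarchy_ids)
  hierarchy_ids.select do |id|
    num_found(entries_core, "hierarchy_id:#{id}") > 0 &&
      num_found(relationships_core, "hierarchy_id_1:#{id}").zero?
  end
end

# 903 uses the counts quoted above; the 1502 entry count is a placeholder.
he  = FakeCore.new("hierarchy_id:903" => 986_586, "hierarchy_id:1502" => 1)
her = FakeCore.new("hierarchy_id_1:903" => 15_827_973)

missing = missing_from_relationships(he, her, [903, 1502])
puts missing.inspect
```

In the real console the two `FakeCore` objects would presumably be replaced by the `SolrCore::HierarchyEntries` and `SolrCore::HierarchyEntryRelationships` instances used above.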
...Looking into why. FTR, I did (re-)call the relator (which should have built HER core):
hierarchy = Hierarchy.find 1502
event = hierarchy.resource.latest_harvest_event
ids = event.new_hierarchy_entry_ids
Hierarchy::Relator.relate(hierarchy, entry_ids: ids)
... And it looked like it was inserting something... :) But clearly not into HER core. (ahem)
Preeeeeeeeety sure I found the problem; a wrong variable name when building HER cores, which would have silently found nothing to do. :(
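The thread does not show the actual line that was wrong, so the following is only an illustration of how this class of bug can stay silent in Ruby. It assumes an options hash keyed by `:entry_ids` (as in the `relate` call above); the method names and the misspelled key are hypothetical.

```ruby
# Buggy variant: reads a key nobody passes. Hash#[] returns nil for a missing
# key and Array(nil) is [], so the caller silently gets nothing to process.
def entries_to_index_buggy(options)
  Array(options[:entry_id])
end

# Stricter variant: Hash#fetch raises KeyError on a missing or misspelled key,
# and an empty list is treated as an error rather than as "nothing to do".
def entries_to_index(options)
  ids = options.fetch(:entry_ids)
  raise ArgumentError, "no entry ids given" if ids.empty?
  ids
end

buggy = entries_to_index_buggy(entry_ids: [1, 2, 3])
fixed = entries_to_index(entry_ids: [1, 2, 3])
puts buggy.inspect
puts fixed.inspect
```

Failing loudly on an empty input is a design choice; a harvest with genuinely no new entries would need to be handled separately.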
Now I'm not sure this works, since it's never actually run. :( Moving forward...
Fixed the underlying problem... now to see if it will merge its taxa! :S
Well, that was a mixed result. :S It found the match, but it looks like it just ... unpublished Anne's page, rather than properly merging it. ...Gotta look into THAT now...
Some of these look dangerous. :S Logging them here for posterity:
** 21:44:00.730(0.02) MATCH: Concept 2492255 = 45876581
** 21:44:01.540(0.02) MATCH: Concept 2492297 = 45876764
** 21:44:02.050(0.02) MATCH: Concept 2492234 = 45876750
** 21:44:02.603(0.02) MATCH: Concept 2492257 = 45876680
** 21:44:03.210(0.03) MATCH: Concept 2492261 = 45876567
** 21:44:03.719(0.02) MATCH: Concept 12141711 = 45876617
** 21:44:04.344(0.04) MATCH: Concept 2915013 = 45876762
** 21:44:05.043(0.04) MATCH: Concept 3223 = 45876537
** 21:44:05.678(0.02) MATCH: Concept 900689 = 45876538
** 21:44:06.301(0.02) MATCH: Concept 12082 = 45876540
** 21:44:06.857(0.05) MATCH: Concept 11731 = 45876539
** 21:44:07.438(0.02) MATCH: Concept 4783 = 45876541
** 21:44:07.984(0.02) MATCH: Concept 11699 = 45876542
** 21:44:08.538(0.03) MATCH: Concept 901278 = 45876643
** 21:44:09.174(0.03) MATCH: Concept 899711 = 45876623
** 21:44:09.880(0.02) MATCH: Concept 898727 = 45876759
** 21:44:10.509(0.03) MATCH: Concept 900829 = 45876597
** 21:44:11.001(0.02) MATCH: Concept 907095 = 45876692
** 21:44:11.598(0.02) MATCH: Concept 907106 = 45876566
** 21:44:12.139(0.02) MATCH: Concept 901252 = 45876760
** 21:44:12.682(0.02) MATCH: Concept 901045 = 45876681
** 21:44:13.221(0.03) MATCH: Concept 901284 = 45876632
** 21:44:13.693(0.03) MATCH: Concept 901265 = 45876718
** 21:44:14.288(0.02) MATCH: Concept 898881 = 45876667
** 21:44:15.314(0.02) MATCH: Concept 901037 = 45876717
** 21:44:15.858(0.02) MATCH: Concept 899682 = 45876767
** 21:44:16.446(0.09) MATCH: Concept 901277 = 45876651
** 21:44:17.018(0.02) MATCH: Concept 918955 = 45876711
** 21:44:17.603(0.02) MATCH: Concept 918960 = 45876568
** 21:44:18.110(0.02) MATCH: Concept 899710 = 45876625
** 21:44:18.645(0.02) MATCH: Concept 898878 = 45876693
** 21:44:19.257(0.02) MATCH: Concept 899718 = 45876586
** 21:44:19.824(0.02) MATCH: Concept 901291 = 45876599
** 21:44:20.392(0.02) MATCH: Concept 907089 = 45876721
** 21:44:20.913(0.02) MATCH: Concept 902096 = 45876649
** 21:44:21.468(0.02) MATCH: Concept 907099 = 45876650
** 21:44:22.549(0.03) MATCH: Concept 901271 = 45876691
** 21:44:23.241(0.02) MATCH: Concept 901276 = 45876670
** 21:44:23.872(0.02) MATCH: Concept 907102 = 45876609
** 21:44:24.391(0.02) MATCH: Concept 899709 = 45876633
** 21:44:24.928(0.02) MATCH: Concept 901290 = 45876604
** 21:44:25.519(0.02) MATCH: Concept 901296 = 45876587
** 21:44:26.045(0.02) MATCH: Concept 907090 = 45876719
** 21:44:26.565(0.02) MATCH: Concept 898874 = 45876753
** 21:44:27.121(0.02) MATCH: Concept 901050 = 45876644
** 21:44:27.705(0.03) MATCH: Concept 899696 = 45876725
** 21:44:28.349(0.02) MATCH: Concept 901269 = 45876697
** 21:44:28.923(0.02) MATCH: Concept 901257 = 45876741
** 21:44:29.459(0.02) MATCH: Concept 907092 = 45876712
** 21:44:29.981(0.02) MATCH: Concept 898884 = 45876570
** 21:44:31.128(0.02) MATCH: Concept 907109 = 45876554
** 21:44:31.641(0.02) MATCH: Concept 901289 = 45876611
** 21:44:32.132(0.02) MATCH: Concept 901295 = 45876589
** 21:44:32.673(0.02) MATCH: Concept 901249 = 45876769
** 21:44:33.180(0.02) MATCH: Concept 907110 = 45876552
** 21:44:33.805(0.02) MATCH: Concept 907101 = 45876629
** 21:44:34.356(0.03) MATCH: Concept 907100 = 45876640
** 21:44:34.900(0.02) MATCH: Concept 901049 = 45876653
** 21:44:35.541(0.03) MATCH: Concept 899695 = 45876726
** 21:44:36.271(0.03) MATCH: Concept 901282 = 45876636
** 21:44:36.839(0.03) MATCH: Concept 901263 = 45876723
** 21:44:37.376(0.03) MATCH: Concept 907096 = 45876690
** 21:44:37.963(0.03) MATCH: Concept 916275 = 45876595
** 21:44:38.543(0.02) MATCH: Concept 901268 = 45876699
** 21:44:39.139(0.1) MATCH: Concept 907103 = 45876603
** 21:44:39.702(0.03) MATCH: Concept 907093 = 45876703
** 21:44:40.280(0.02) MATCH: Concept 907091 = 45876715
** 21:44:40.847(0.02) MATCH: Concept 907087 = 45876724
** 21:44:41.382(0.02) MATCH: Concept 901262 = 45876728
** 21:44:41.893(0.02) MATCH: Concept 916277 = 45876590
** 21:44:42.550(0.03) MATCH: Concept 899714 = 45876605
** 21:44:43.062(0.03) MATCH: Concept 896134 = 45876698
** 21:44:43.888(0.02) MATCH: Concept 916276 = 45876593
** 21:44:44.553(0.02) MATCH: Concept 901048 = 45876654
** 21:44:45.072(0.03) MATCH: Concept 899694 = 45876729
** 21:44:45.652(0.03) MATCH: Concept 901281 = 45876638
** 21:44:46.157(0.02) MATCH: Concept 901287 = 45876618
** 21:44:46.689(0.02) MATCH: Concept 898873 = 45876761
** 21:44:47.406(0.03) MATCH: Concept 899721 = 45876558
** 21:44:48.122(0.02) MATCH: Concept 901274 = 45876686
** 21:44:48.679(0.02) MATCH: Concept 901301 = 45876544
** 21:44:49.271(0.02) MATCH: Concept 916274 = 45876612
** 21:44:49.827(0.02) MATCH: Concept 901261 = 45876733
** 21:44:50.518(0.09) MATCH: Concept 899713 = 45876621
** 21:44:51.130(0.02) MATCH: Concept 907108 = 45876562
** 21:44:51.881(0.02) MATCH: Concept 901047 = 45876659
** 21:44:52.474(0.02) MATCH: Concept 899693 = 45876732
** 21:44:53.501(0.02) MATCH: Concept 898872 = 45876770
** 21:44:54.020(0.03) MATCH: Concept 916273 = 45876756
** 21:44:54.568(0.03) MATCH: Concept 918959 = 45876627
** 21:44:55.108(0.02) MATCH: Concept 901260 = 45876737
** 21:44:55.635(0.02) MATCH: Concept 899712 = 45876619
** 21:44:56.152(0.02) MATCH: Concept 898879 = 45876671
** 21:44:56.723(0.02) MATCH: Concept 898728 = 45876735
** 21:44:57.348(0.03) MATCH: Concept 907094 = 45876700
** 21:44:57.889(0.02) MATCH: Concept 901270 = 45876696
** 21:44:58.411(0.03) MATCH: Concept 899719 = 45876579
** 21:44:59.140(0.03) MATCH: Concept 901038 = 45876708
** 21:44:59.742(0.02) MATCH: Concept 901272 = 45876687
** 21:45:00.290(0.03) MATCH: Concept 2492318 = 45876714
** 21:45:01.001(0.03) MATCH: Concept 2492259 = 45876754
** 21:45:01.521(0.02) MATCH: Concept 2492251 = 45876746
** 21:45:02.112(0.09) MATCH: Concept 2492243 = 45876564
** 21:45:03.042(0.03) MATCH: Concept 2492263 = 45876672
** 21:45:03.556(0.02) MATCH: Concept 2492303 = 45876559
** 21:45:04.066(0.02) MATCH: Concept 2492312 = 45876602
** 21:45:04.799(0.02) MATCH: Concept 2492248 = 45876637
** 21:45:05.335(0.02) MATCH: Concept 2492247 = 45876630
** 21:45:05.858(0.02) MATCH: Concept 2492321 = 45876772
** 21:45:06.374(0.02) MATCH: Concept 2492260 = 45876648
** 21:45:06.950(0.02) MATCH: Concept 2492308 = 45876745
** 21:45:07.501(0.02) MATCH: Concept 2492310 = 45876550
** 21:45:08.074(0.02) MATCH: Concept 2492316 = 45876676
** 21:45:08.616(0.02) MATCH: Concept 2492295 = 45876768
** 21:45:09.136(0.02) MATCH: Concept 2492244 = 45876606
** 21:45:09.655(0.02) MATCH: Concept 2492246 = 45876624
** 21:45:10.269(0.02) MATCH: Concept 2492306 = 45876702
** 21:45:10.805(0.02) MATCH: Concept 2492315 = 45876628
** 21:45:11.347(0.02) MATCH: Concept 2492252 = 45876749
** 21:45:11.856(0.02) MATCH: Concept 2492319 = 45876751
** 21:45:12.395(0.02) MATCH: Concept 2492258 = 45876549
** 21:45:13.286(0.02) MATCH: Concept 2492242 = 45876556
** 21:45:13.930(0.02) MATCH: Concept 2492305 = 45876631
** 21:45:14.464(0.02) MATCH: Concept 2492254 = 45876763
** 21:45:15.164(0.03) MATCH: Concept 3061127 = 45876713
** 21:45:15.722(0.02) MATCH: Concept 3061006 = 45876743
** 21:45:16.292(0.02) MATCH: Concept 3061021 = 45876572
** 21:45:16.811(0.02) MATCH: Concept 3061150 = 45876744
** 21:45:17.373(0.02) MATCH: Concept 3061163 = 45876765
** 21:45:17.954(0.03) MATCH: Concept 3061075 = 45876652
** 21:45:18.460(0.02) MATCH: Concept 3061149 = 45876742
** 21:45:18.971(0.02) MATCH: Concept 3061030 = 45876588
** 21:45:19.605(0.02) MATCH: Concept 3061002 = 45876555
** 21:45:20.106(0.02) MATCH: Concept 3061148 = 45876740
** 21:45:20.685(0.03) MATCH: Concept 3061029 = 45876585
** 21:45:21.197(0.02) MATCH: Concept 2915112 = 45876738
** 21:45:21.704(0.02) MATCH: Concept 3061123 = 45876706
** 21:45:22.754(0.02) MATCH: Concept 3061028 = 45876582
** 21:45:23.387(0.02) MATCH: Concept 3061026 = 45876578
** 21:45:24.015(0.03) MATCH: Concept 3061144 = 45876736
** 21:45:24.555(0.03) MATCH: Concept 3061050 = 45876622
** 21:45:25.141(0.03) MATCH: Concept 3061049 = 45876616
** 21:45:25.735(0.02) MATCH: Concept 3061025 = 45876577
** 21:45:26.263(0.02) MATCH: Concept 3061024 = 45876576
** 21:45:26.818(0.02) MATCH: Concept 3061090 = 45876678
** 21:45:27.551(0.03) MATCH: Concept 3061068 = 45876641
** 21:45:28.195(0.02) MATCH: Concept 3061140 = 45876731
** 21:45:28.751(0.02) MATCH: Concept 3061089 = 45876677
** 21:45:29.278(0.03) MATCH: Concept 3061139 = 45876730
** 21:45:29.797(0.02) MATCH: Concept 3061115 = 45876701
** 21:45:30.326(0.02) MATCH: Concept 3060992 = 45876546
** 21:45:30.830(0.02) MATCH: Concept 3061020 = 45876569
** 21:45:31.370(0.03) MATCH: Concept 3061384 = 45876626
** 21:45:31.922(0.03) MATCH: Concept 3061042 = 45876598
** 21:45:32.435(0.02) MATCH: Concept 3061018 = 45876565
** 21:45:32.951(0.02) MATCH: Concept 3061085 = 45876669
** 21:45:33.563(0.02) MATCH: Concept 3061084 = 45876665
** 21:45:34.076(0.02) MATCH: Concept 3061015 = 45876563
** 21:45:34.606(0.02) MATCH: Concept 3061083 = 45876663
** 21:45:35.109(0.02) MATCH: Concept 3061087 = 45876674
** 21:45:35.706(0.09) MATCH: Concept 3061082 = 45876661
** 21:45:36.264(0.02) MATCH: Concept 3061107 = 45876695
** 21:45:36.840(0.03) MATCH: Concept 3061081 = 45876660
** 21:45:37.598(0.23) MATCH: Concept 3061038 = 45876592
** 21:45:38.170(0.02) MATCH: Concept 3061037 = 45876591
** 21:45:38.773(0.02) MATCH: Concept 3061105 = 45876694
** 21:45:39.307(0.04) MATCH: Concept 896917 = 45876657
** 21:45:39.869(0.02) MATCH: Concept 288 = 45876536
** 21:45:40.708(0.02) MATCH: Concept 907105 = 45876775
** 21:45:42.012(0.03) MATCH: Concept 23186533 = 0
** 21:45:42.746(0.02) MATCH: Concept 540519 = 0
** 21:45:43.417(0.02) MATCH: Concept 23816253 = 45876600
** 21:45:43.985(0.02) MATCH: Concept 32281306 = 45876679
** 21:45:44.945(0.02) MATCH: Concept 896915 = 45876707
** 21:45:46.241(0.02) MATCH: Concept 38996124 = 45876758
** 21:45:46.941(0.02) MATCH: Concept 896914 = 45876594
** 21:45:47.585(0.02) MATCH: Concept 907088 = 0
** 21:45:49.026(0.02) MATCH: Concept 8804237 = 45876642
** 21:45:49.606(0.02) MATCH: Concept 8804239 = 45876560
** 21:45:50.187(0.02) MATCH: Concept 918961 = 45876684
** 21:45:50.775(0.02) MATCH: Concept 28631172 = 45876780
** 21:45:51.740(0.03) MATCH: Concept 38996136 = 45876755
** 21:45:52.268(0.02) MATCH: Concept 38996138 = 45876547
** 21:45:52.807(0.04) MATCH: Concept 38996140 = 45876575
** 21:45:53.369(0.02) MATCH: Concept 38996141 = 45876580
** 21:45:53.881(0.02) MATCH: Concept 38996145 = 45876656
** 21:45:54.463(0.02) MATCH: Concept 38996146 = 45876655
** 21:45:55.054(0.02) MATCH: Concept 38996147 = 45876668
** 21:45:55.584(0.02) MATCH: Concept 38996148 = 45876688
** 21:45:56.217(0.02) MATCH: Concept 38996149 = 45876727
** 21:45:57.197(0.02) MATCH: Concept 38996151 = 45876752
** 21:45:57.991(0.02) MATCH: Concept 38996156 = 45876709
** 21:45:58.506(0.02) MATCH: Concept 38996157 = 45876722
** 21:45:59.016(0.02) MATCH: Concept 38996159 = 45876543
** 21:45:59.560(0.02) MATCH: Concept 38996162 = 45876553
** 21:46:00.148(0.02) MATCH: Concept 38996163 = 45876561
** 21:46:00.662(0.02) MATCH: Concept 38996164 = 45876573
** 21:46:01.166(0.02) MATCH: Concept 38996165 = 45876574
** 21:46:01.746(0.02) MATCH: Concept 38996166 = 45876583
** 21:46:02.345(0.02) MATCH: Concept 38996167 = 45876601
** 21:46:02.870(0.02) MATCH: Concept 38996168 = 45876607
** 21:46:03.395(0.02) MATCH: Concept 38996169 = 45876608
** 21:46:03.927(0.03) MATCH: Concept 38996171 = 45876614
** 21:46:04.496(0.02) MATCH: Concept 38996172 = 45876620
** 21:46:05.009(0.02) MATCH: Concept 38996173 = 45876639
** 21:46:05.889(0.02) MATCH: Concept 38996174 = 45876662
** 21:46:06.410(0.02) MATCH: Concept 38996175 = 45876664
** 21:46:07.063(0.02) MATCH: Concept 38996176 = 45876666
** 21:46:07.616(0.02) MATCH: Concept 38996177 = 45876673
** 21:46:08.132(0.02) MATCH: Concept 38996179 = 45876704
** 21:46:08.655(0.02) MATCH: Concept 38996180 = 45876716
** 21:46:09.280(0.02) MATCH: Concept 38996182 = 45876720
** 21:46:09.781(0.02) MATCH: Concept 38996183 = 45876739
** 21:46:10.311(0.02) MATCH: Concept 38996188 = 45876766
** 21:46:10.835(0.02) MATCH: Concept 38996189 = 45876610
** 21:46:11.379(0.02) MATCH: Concept 38996191 = 45876557
** 21:46:11.950(0.02) MATCH: Concept 38996195 = 45876596
** 21:46:12.457(0.02) MATCH: Concept 38996199 = 45876634
** 21:46:12.968(0.02) MATCH: Concept 902090 = 45876645
** 21:46:13.483(0.02) MATCH: Concept 38996203 = 45876647
** 21:46:14.438(0.02) MATCH: Concept 38996208 = 45876710
** 21:46:14.990(0.02) MATCH: Concept 38996209 = 45876748
** 21:46:15.540(0.02) MATCH: Concept 38996200 = 0
** 21:46:16.347(0.05) MATCH: Concept 898848 = 45876682
(I am, of course, deeply disturbed by the "0s" ... I am hoping that's just a logging error!)
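A small sketch for pulling the worrying lines out of a log like the one above. It assumes only the line format shown in this thread (`MATCH: Concept <id> = <id>`); the `zero_targets` name and the regular expression are illustrative, not part of the harvesting code.

```ruby
# Matches lines such as "** 21:45:42.012(0.03) MATCH: Concept 23186533 = 0".
MATCH_LINE = /MATCH: Concept (\d+) = (\d+)/

# Returns the left-hand concept ids of every match whose right-hand id is 0.
def zero_targets(log_text)
  log_text.each_line.map do |line|
    m = MATCH_LINE.match(line)
    m[1].to_i if m && m[2].to_i.zero?
  end.compact
end

# Four lines copied from the log above.
sample = <<~LOG
  ** 21:45:40.708(0.02) MATCH: Concept 907105 = 45876775
  ** 21:45:42.012(0.03) MATCH: Concept 23186533 = 0
  ** 21:45:42.746(0.02) MATCH: Concept 540519 = 0
  ** 21:45:43.417(0.02) MATCH: Concept 23816253 = 45876600
LOG

zeros = zero_targets(sample)
puts zeros.inspect
```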
FWIW some of us are tentatively happy the spurious page was unpublished. We were wondering if all these pages would have to be cleaned up manually once this is fixed. Yayyyy....?
Aaaaactually, it looks like the merge does work, and it's the site's "auto-redirection" that is not working. Blast! ...But probably a separate ticket, sadly...
UPDATE: indeed, it never even calls the redirect_if_superceded method anymore ...I just added it; I hope I put it in the right place, though. :S
UPDATE 2: Nope, it must not be in the right place; still not working (on beta.eol.org where I've updated it)
Okay, now we have two (new) problems.
These sound like things that may also have been affecting our Admin taxon concept management experience in recent months. Two birds, one stone?
No, I don't think these are related; this is a bug PURELY with the new harvesting code, not with concept management. :S Sorry! (That said, concept management SHOULD (really) be re-written to use the new Ruby code, but ... I doubt we'll get thumbs-up for that before Tramea.)
Anyway, I managed to write the code to move traits, but I haven't fixed all of those merges with 0. :S Will have to do that next week. Sorry for the missing pages! :( :( :(
ALSO need to fix: the merge does not reindex the page, and it needs to. That's a bit of a pain. :(
Roger that; don't worry about the wait on this resource; we're still fiddling with it.
Dammit, it looks like I really screwed up those pages with the 0s after the ='s. :(
HierarchyEntry.where(taxon_concept_id: 0).count
(0.5ms) SELECT COUNT(*) FROM `hierarchy_entries` WHERE `hierarchy_entries`.`taxon_concept_id` = 0
=> 44
...These will have to be manually matched back to the right id. Luckily, there aren't too many ids to match against, just the 0s:
23186533
540519
907088
38996200
These pages are lost for good. They should have had redirects taking them to the right page (2 paragraphs down), but because I can't know which was which, I have to leave them "blank". :S ...The data are not lost, though; they are "properly" remapped to the taxa where they were going (before the code changed the target to a 0).
Looks like they map to the following taxa (by their entry ids):
907088 (Gymnodinium pulchellum, Takayama pulchella) => [16228076, 19781811, 26305689, 34716614, 40729716, 44431149, 48595638, 20188634, 20616606, 24512469, 26510766, 32653886, 36419832, 45242657, 46155887, 54419292, 55413747, 60625346, 29151588, 39206587, 51383322, 26510641, 49776029, 54424593, 56187305, 56339413, 58917735, 20767877, 30056797, 35568142, 44431321, 44797497, 45728393, 48932721, 55100013, 58917297, 57071167],
900689 (Phaeocystis pouchetii) => [19415242, 26320487, 33176315, 34733532, 48623193, 40746696],
899696 (Gymnodinium punctatum) => [56340314]
That also screwed up all of the associated data: TaxonConceptName, DataObjectsTaxonConcept, TaxonConceptsFlattened, and TaxonConceptsFlattened (by ancestor), and possibly RandomHierarchyImage.
Really glad it was only four... cause that's going to be a hard mess to fix. :(
Resulting in VERY weird pages right now: http://eol.org/pages/540519/overview
...Again, will fix next week. Sorry, sorry. Bad, bad.
I bet we get away with those four pages for a weekend. So we should be conservative with our test resources for this...
Actually, I'm not aware of a way to be "conservative" in choosing a test resource for taxa merging. It's merging whatever you're using against ... everything ... so you can mess up everything.
Fortunately, I'm reasonably sure this can't happen again, and I'm reasonably sure there aren't any problems as bad as this lurking in the wings... but, of course, that may be short-sighted. :S
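The thread does not say how the code was changed, so this is just one possible shape for a guard against the "merge into 0" case, sketched under that assumption. `checked_merge_pair` and `InvalidMergeError` are hypothetical names, and the direction of the merge is left to the caller.

```ruby
# Raised when a merge is requested with an unusable concept id.
class InvalidMergeError < StandardError; end

# Returns [from_id, to_id] unchanged when both are positive integers and
# differ from each other; raises InvalidMergeError otherwise.
def checked_merge_pair(from_id, to_id)
  [from_id, to_id].each do |id|
    unless id.is_a?(Integer) && id.positive?
      raise InvalidMergeError,
            "refusing to merge #{from_id.inspect} into #{to_id.inspect}"
    end
  end
  if from_id == to_id
    raise InvalidMergeError, "cannot merge concept #{from_id} into itself"
  end
  [from_id, to_id]
end

# One valid pair and one pair with a 0, both taken from the log above.
ok = checked_merge_pair(45876733, 901261)
puts ok.inspect

begin
  checked_merge_pair(23186533, 0)
rescue InvalidMergeError => e
  puts e.message
end
```

Raising before any rows are touched would leave the entries where they were instead of pointing them at `taxon_concept_id = 0`.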
Okay, I think I've "properly" merged all those entries into the taxa where they should have gone (but didn't). Sigh.
Those pages (listed above) might need reindexing, though, for full effect...
Perhaps; http://eol.org/pages/540519/overview is an oops error right now.
...Which are all "not found" right now. Odd. :S They are all published and vetted, soooo... uhhh... I'm a bit confused.
Oh! Must be something other than the taxon that is "not found" ... this is a weird and unfortunate error that happens on occasion. I'll revisit after lunch, probably an easy fix (just need to check the logs for details).
No need to re-label the Oopses; I doubt they'll be catching much traffic on their own.
The new pages are fine; I fixed your links above :)
PHEW! Thanks. Sorry.
Status: so, now that the code is fixed (and the mistake cleaned up), we just need to re-run the resource... which I think we were planning on doing anyway. So I'm waiting on Anne for that (she wanted to fix a problem with references).
Shall we try the other resource, http://eol.org/content_partners/494/resources/958, in the meantime?
Sure; I'll try and get that going today... if I'm lucky... :S
It actually was harvested, believe it or not... but this was before the concept-merging code was fixed, so it may not show up in the right places (i.e.: "duplicate taxa"). I'll re-harvest it.
Hmm… I suspect yesterday's harvest merged most or all the taxa correctly! It’s just the collection that hasn’t got the news.
e.g.: http://eol.org/pages/597765/overview, a fine respectable old page, received content, but it does not appear in the collection, according to the taxon page (http://eol.org/pages/597765/communities/collections) and the collection, as best I could search it (http://eol.org/collections/118307/taxa?page=3&sort_by=3&utf8=%E2%9C%93&view_as=1). Some such taxa do appear in the collection, as well as some spurious taxa, which are unpublished.
I am cautiously optimistic!
Hmm... I don't see a reindex button on the collection page. Perhaps because of its manually-harvested state?
Oh, right: there is no button. It would be simple to add one, we should ticketize that. But you can just add /reindex to the URL of the collection to accomplish the same thing!
On Sun, Feb 21, 2016 at 12:13 PM, Jen Hammock notifications@github.com wrote:
Hmm... I don't see a reindex button on the collection page. Perhaps because of its manually-harvested state?
— Reply to this email directly or view it on GitHub https://github.com/EOL/tramea/issues/161#issuecomment-186861967.
Right! Katja taught me that trick and I forgot. But two things:
Collections do typically have a "reindex page" button on them (not sure when this was implemented). That, and the "download data" button are missing from this collection. Probably just part of the extra special state it's in.
Alas, I performed the reindex in the nav bar, it claimed to have completed almost instantly, and nothing in the list of taxa has changed, as far as I can tell.
We still have pretty good evidence that the merges worked! But if we update partner collections, we'll need the collections to direct the partners to for review; sorry...
Ooooohhhhhh... you know, I don't think we allow the reindex button on "harvest collections," now that I think of it.
I'll just manually invoke the code that will rebuild it. Sorry for the hassle... I'll post here when I've done that.
No, I wondered, so I checked: http://eol.org/collections/100883
Just in case that is a relevant clue...
[sigh] Looking at the code, I see there's no button because the number of items exceeds the "Collection::REINDEX_LIMIT", which is currently set to 1000.
Clear as mud. ...But there's the answer.
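Roughly, the rule described above could look like the sketch below. Only the constant name `Collection::REINDEX_LIMIT` and its value of 1000 come from this thread; the stand-in class, the `item_count` attribute, and `show_reindex_button?` are assumptions made so the example runs on its own.

```ruby
# Stand-in for the real Collection model.
class Collection
  REINDEX_LIMIT = 1000

  attr_reader :item_count

  def initialize(item_count)
    @item_count = item_count
  end

  # The button is hidden once the number of items exceeds the limit.
  def show_reindex_button?
    item_count <= REINDEX_LIMIT
  end
end

small = Collection.new(250)
large = Collection.new(1001)
puts small.show_reindex_button?
puts large.show_reindex_button?
```

If the real check works this way, any harvest collection with more than 1000 items would show no button.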
I'll do it manually in a bit.
Just goes to show you how often I reindex collections; I never noticed... Looking forward to reindexed Feller. I will fine-tooth-comb the merging and report back.
As mentioned in chat, it's "reindexed".
Uh oh. All taxa are Prunus padus sstr: http://eol.org/collections/122221
The original collection still exists, but looks about the same, so I presume the changes did not manifest there: http://eol.org/collections/118307
Ha!
(I laugh only to keep from weeping.)
I will have to look into this, clearly. FWIW, there are actually a few different species listed (if you paginate), but clearly this is very, very wrong (it shouldn't even be "possible" in the code, since I'm supposed to have a check against duplicates). Something went very wrong. More this iteration.
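For reference, a duplicate check of the kind mentioned above might look like this sketch. The item shape (a hash with `:taxon_concept_id` and `:name`), the method names, and the sample ids are all hypothetical; the real collection-building code is not shown in this thread.

```ruby
# Keeps the first item seen for each taxon concept id.
def unique_by_concept(items)
  items.uniq { |item| item[:taxon_concept_id] }
end

# Lists the concept ids that occur more than once, to report or log.
def duplicated_concept_ids(items)
  items.group_by { |item| item[:taxon_concept_id] }
       .select { |_id, group| group.size > 1 }
       .keys
end

# Placeholder ids; only the repeated-taxon symptom comes from the thread.
items = [
  { taxon_concept_id: 1, name: "Prunus padus" },
  { taxon_concept_id: 1, name: "Prunus padus" },
  { taxon_concept_id: 2, name: "Gymnodinium punctatum" },
  { taxon_concept_id: 1, name: "Prunus padus" }
]

deduped = unique_by_concept(items)
dupes = duplicated_concept_ids(items)
puts deduped.map { |i| i[:taxon_concept_id] }.inspect
puts dupes.inspect
```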
Fun-fun! ;(
(The collection is fixed; talked about in Gitter.)
Is this closed now?
This appears to have happened with two of the fresh resources,
http://eol.org/content_partners/494/resources/958
http://eol.org/content_partners/729/resources/969
This was a recent problem, of course, and I cannot remember if we had evidence that it was resolved, or just work done on the fix, which these harvests should have tested. @JRice, @KatjaSchulz?