sfg-taxonpages / orthoptera

another OTU with missing pic #69

Closed klausriede closed 5 months ago

klausriede commented 5 months ago

https://orthoptera.speciesfile.org/otus/928913/overview

typophyllum commented 5 months ago

This is the same problem as recently (https://github.com/sfg-taxonpages/orthoptera/issues/65). It picked up the duplicate OTU 928913 of the synonym consobrina. After I deleted that, it selected the duplicate OTU 929631 of carinata, still without pictures. After I deleted that one too, it now finds OTU 824516, which has the pictures. Very confusing.

klausriede commented 5 months ago

@mjy @debpaul @MMCigliano @LocoDelAssembly dumb OTUs seem to be a serious structural problem, see above! Could somebody please explain to the rest of us how this is possible? Preferably in simple words? A database model which generates confusion must be examined thoroughly!

mjy commented 5 months ago

@klausriede:

Hope this helps.

klausriede commented 5 months ago

if there is no confident mapping we have a problem...

mjy commented 5 months ago

> if there is no confident mapping we have a problem...

Well yes, you pointed that out.

@klausriede You realize the data in SFG (and any system) was not perfect, correct? The structure and format of the data were inconsistent in places. This is not a fault of the SFG software; it is the nature of complex tools and of humans who use tools inconsistently (we are, after all, human). I'm sure, as I mentioned succinctly above, that you appreciate that when you move (think of going from one house to a new one) you aren't in the same place: you lose some things, you gain some things, i.e. a proportion is lost (and gained) in any move. This problem is a reflection of the nature of such moves. Given the time frame over which OSF has evolved (decades), it is frankly astonishing to me that there are not more such problems. Remember too that we have done migrations for equally complicated datasets, all with different models, all coming to TW (3i, UCD, LepIndex etc.). All had edge-case issues, so OSF is not alone. If we look at the total time spent on the SF migration (months and months of work) relative to the errors encountered, you'll see there was a diminishing payoff between effort spent and accuracy gained. Could we have spent vastly more time migrating the last small fraction of data more "correctly"? Maybe. Can the issues be addressed, perhaps with less overall time spent, now? Most certainly. In time the issues you see will be resolved, and we'll move on. In time TaxonWorks will go away, and hopefully its collective content will move along; when that happens the cycle will continue anew. We all work on a Taxonomic Timescale; demanding that things be perfect yesterday is, dare I say, tiresome.

Thanks for your understanding.

MMCigliano commented 5 months ago

Thank you for sharing your observations and concerns regarding the OSF maps, Klaus. Matt has raised a valid point about the complexities inherent in data migration, especially when dealing with tools and human input. It's true that the nature of such moves involves a certain proportion of data loss and gain. Inconsistencies in structure and format can be expected in complex datasets, and Matt's analogy to moving homes is quite apt. The extensive effort that went into the SF migration, spanning months of work, underscores the dedication to ensuring accuracy. While some errors were encountered along the way, it's important to recognize the diminishing returns in terms of effort spent versus accuracy gained. Matt's insight into the overall proportion of time spent relative to the encountered errors provides valuable context. As we continue to address these issues, I assure you that our OSF team is committed to resolving them efficiently. Moreover, I want to acknowledge that the display of maps in the old OSF system was not without its share of errors. The integrated GIS tools in TW represent a substantial improvement over the capabilities of the old Species File system. Your insights are valuable as we strive to make the transition to TaxonWorks the best it can be.

klausriede commented 5 months ago

Thank you! What worried me is that I did not spend much time diagnosing the errors; they occurred during routine cross-checks when I came across something interesting on iNaturalist. As to migrations, we are just now experiencing a major catastrophe in a sub-branch of Deutsche Bank (Postbank) here in Germany, with users blocked etc. It is therefore important for us to understand the differences in data structure, in particular the algorithm leading to numb OTUs. I must admit that I still don't understand the OTUs in this database context, which are evidently not meant in a biological sense. That said, it would be important (for me) to add temporarily defined potential species (types of tomorrow), as found in several theses and our papers from Panama (mainly mOTUs). Speaking about maps, I think it is important to have a caption in the overview maps (not for the species points, which are self-explanatory), and references for where the info comes from if it is at country level. Some nice map caveats can be found in the old OSF. I think the new TP interface has meanwhile improved considerably, but I have not worked with TW - definitely another topic. Matt, I am still curious to learn more about "the algorithm" you mentioned. Aspera ad astra! Best wishes, Klaus

klausriede commented 5 months ago

PS: My remark about confident mapping at the beginning of the thread did not refer to (GIS) maps!