Closed: tripleee closed this issue 6 years ago
cc'ing @Undo1 for console magic
I can definitely update them in bulk. Throw me a list of ids/strings and the desired transformation(s) and I'll get it done.
... Actually, looks like my query unearthed more than just the ones I had manually edited, and possibly not those at all, but still a useful diagnostic (currently 39 rows). At least besthealthdiet.com and drozien.com in those results look like examples of what I originally reported, but some of the other results are just duplicates (e.g. basij.um.ac.ir seems to have been extracted twice from a single post somehow).
The symptoms are more complex than I thought, and there may be more than a single root cause.
There does not seem to be a need to rename things, generally speaking.
I think that where I had manually renamed the www.something domain to just something, this would then result in a new www.something record being created. Perhaps this is also the reason it appears that some domains were extracted twice from some posts, but I can't completely confirm this; there may be a separate bug behind that.
In addition, it looks like some duplicate domain records were created simply because of a race condition.
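One common way to rule out this kind of race is to let the database enforce uniqueness and make the insert idempotent. The sketch below uses Python's sqlite3 module as a stand-in; the domains table, its columns, and the get_or_create_domain helper are hypothetical and are not taken from the application's code.

```python
import sqlite3


def get_or_create_domain(conn: sqlite3.Connection, name: str) -> int:
    """Return the id for `name`, inserting a row only if none exists yet."""
    with conn:  # commits on success, rolls back on error
        # The UNIQUE constraint makes a second insert of the same name a no-op.
        conn.execute("INSERT OR IGNORE INTO domains (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT id FROM domains WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE domains (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)

first = get_or_create_domain(conn, "example.com")
second = get_or_create_domain(conn, "example.com")
row_count = conn.execute("SELECT COUNT(*) FROM domains").fetchone()[0]
print(first == second, row_count)
```

A check-then-insert done purely in application code can still create two rows when two workers run at once; the unique constraint is what prevents the second row here.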
The end result is that many records are redundant and I could simply remove them myself; but others are more complex and need some manual merging.
I've tried to order this by complexity, hardest first, but I might not have a full understanding of what it takes to fix them.
besthealtdiet.com
www.besthealtdiet.com and merged with 14428
osxuninstaller.com
sparechange.io:
www.btwvisas.com:
www.canonprintersdrivers.com:
www.firstmats.co.uk:
www.miracgogo.com:
There is a duplicate for www.timberdoorsmelbourne.net.au (done - Art), but it's because there are two post records for the same post; 95607 should be deleted as it duplicates 95606, and then the domain record 16386 can be zapped.
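For the records that need merging rather than plain deletion, the general shape is: repoint the links from the duplicate to the record being kept, drop any links that would become duplicates, then delete the duplicate record. This is only a sketch under assumed names; the domains and post_domains tables and the merge_domains helper are hypothetical, and SQLite stands in for the real database.

```python
import sqlite3


def merge_domains(conn: sqlite3.Connection, keep_id: int, drop_id: int) -> None:
    """Fold the record `drop_id` into `keep_id` and delete `drop_id`."""
    with conn:  # one transaction: all steps apply or none do
        # Move links over; rows that would violate the UNIQUE constraint
        # (the post is already linked to keep_id) are skipped.
        conn.execute(
            "UPDATE OR IGNORE post_domains SET domain_id = ? WHERE domain_id = ?",
            (keep_id, drop_id),
        )
        # Remove the skipped rows, which only duplicated existing links.
        conn.execute("DELETE FROM post_domains WHERE domain_id = ?", (drop_id,))
        conn.execute("DELETE FROM domains WHERE id = ?", (drop_id,))


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE domains (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE post_domains (
        post_id INTEGER NOT NULL,
        domain_id INTEGER NOT NULL,
        UNIQUE (post_id, domain_id)
    );
    INSERT INTO domains VALUES (1, 'example.com'), (2, 'www.example.com');
    INSERT INTO post_domains VALUES (100, 1), (100, 2), (101, 2);
    """
)

merge_domains(conn, keep_id=1, drop_id=2)
links = conn.execute(
    "SELECT post_id, domain_id FROM post_domains ORDER BY post_id"
).fetchall()
remaining = conn.execute("SELECT id, name FROM domains").fetchall()
print(links, remaining)
```

Running the steps inside one transaction means a failure partway through leaves both records and all links as they were.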
These were so trivial I could handle them myself. I'm listing them here for reference.
basij.um.ac.ir was extracted twice in sample 108345
besthealthdiet.com was extracted twice in samples 62920, and
cdn.firebase.com: Removed duplicate domain record 17589
d2zah9y47r7bi2.cloudfront.net: Removed duplicate domain record 17591
browser-update.org: Removed duplicate domain record 17593
fast.fonts.net: Removed duplicate domain record 17595
cdn.heapanalytics.com: Removed duplicate domain record 17598
BSIpro.com was extracted twice in sample 88467
(www.)drozien.com was extracted twice in sample 86963
(www.)facts4supplement.com was extracted twice in samples 86961 and [86992](https://metasmoke.erwaysoftware.com/post/86992)
(www.)healthprograme.com was extracted twice in 14 samples
howtoways.com was extracted twice in sample 85795
idpro.info was extracted twice in sample 102844
jorendrasingh.me was extracted twice in sample 107061
nutritionfit.org was extracted twice in 7 samples
reddit.com and tumblr.com were extracted twice in sample 103237
reddit.com
tumblr.com
saintpat1985.wordpress.com was extracted twice in sample 84394
(www.)stressfreebrains.com was extracted twice in samples 87254 and 87237
(www.)supplementschoice.com was extracted twice in sample 87226
tradingbotpro.com was extracted twice in sample 89796
try-nitricstorm.com was extracted twice in samples 86973 and 86936
tryapext.com was extracted twice in sample 70784
ultraguide.net was extracted twice in sample 105330
(www.)usahealthguide.com was extracted twice in sample 87280
usahealthmarket.info was extracted twice in sample 101343
(www.)weightloss-spot.com was extracted twice in 4 samples
www.ingic.uk was extracted twice in sample 87496
www.pdftodo.com was extracted twice in sample 90060
www.studentstutorial.com was extracted twice in sample 87110
www.techquery.org was extracted twice in sample 87470
www.vitatalalay.com was extracted twice in sample 91271
Manual console magic should be done - see edits to above comment.
There are duplicate entries for some domains. This is otherwise rather harmless, but it prevents updating a domain record when it has a duplicate.
Back when domain tags were introduced, I sometimes changed the domain name from www.domain to just domain. This is now biting me in the rear. Can these somehow be reverted in bulk?
For example, I cannot save edits to https://metasmoke.erwaysoftware.com/domains/10010 (besthealthdiet.com) because it clashes with https://metasmoke.erwaysoftware.com/domains/14424 (nominally www.besthealthdiet.com, but I edited it to refer to besthealtdiet.com too).
I'll be happy to figure out which domains exactly need this, if there is some hope that they can all be fixed.