WebarchivCZ / Seeder

Seeder - Czech webarchive curating tool and public site
MIT License

data migration fail #431

Closed kvasnicaj closed 7 years ago

kvasnicaj commented 7 years ago

Apparently you used a year-old dump when creating the new database...

westfood commented 7 years ago

That's odd... I took the dump from this file:

[rudolf@wa-db00 tmp]$ bzcat wadmin-2017-08-17.sql.bz2 |head -n 20
-- phpMyAdmin SQL Dump
-- version 3.3.6
-- http://www.phpmyadmin.net
--
-- Počítač: localhost
-- Vygenerováno: Středa 23. srpna 2017, 12:56
-- Verze MySQL: 5.0.95
-- Verze PHP: 5.2.14

westfood commented 7 years ago

I'll take a look. How did you figure that out?

kvasnicaj commented 7 years ago

We looked into it: the most recently catalogued source is from December 2016, while in our database it is from August 2017. So maybe something got screwed up during the migration.

westfood commented 7 years ago

I looked here:

https://seeder.webarchiv.cz/seeder/source/list?sort=-created

e.g. https://seeder.webarchiv.cz/seeder/source/detail/10437

Created at | 4 months ago

But I'll try re-running the migration from scratch this afternoon, once I'm in Hostík.

kvasnicaj commented 7 years ago

Ah, so it's not an old dump, because Seeder contains newer sources too. Some of them are missing the cataloguing field (aleph_id), though, and the count shown on the website differs quite a bit. Current state: www.webarchiv.cz shows 5152 and app.webarchiv.cz shows 4246.

westfood commented 7 years ago

OK, so this afternoon I'll re-run the migration and paste the log here, to see whether there are any errors.

kvasnicaj commented 7 years ago

For example, this source: https://seeder.webarchiv.cz/seeder/source/detail/10189 has no aleph_id, but it should have one.

westfood commented 7 years ago

OK, IMHO this will be one for Visgean, but I'll paste the migration procedure and the logs here so he has something to work with.

Visgean commented 7 years ago

hmm ok

Visgean commented 7 years ago

Hmm, OK, so is it only aleph_id that was migrated incorrectly, or something else as well?

kvasnicaj commented 7 years ago

At first glance it looks like only aleph_id is affected, but that's hard to establish this quickly. What worries us is the difference in the count reported under 'Celkem jsme s autory uzavřeli' ("In total we have concluded with authors") on the homepage...

Visgean commented 7 years ago

I think this will simply have to be checked at the database level. I'll need the dump you're using.
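
A database-level check of this kind could be sketched roughly as below. This is only an illustration, not Seeder's actual code: the function name, the report keys, and the assumption that source IDs stay the same between the legacy database and Seeder are all hypothetical, and the rows are made up. The input is simply (source_id, aleph_id) pairs pulled from each database by whatever query fits.

```python
def find_aleph_id_regressions(legacy_rows, migrated_rows):
    """Compare (source_id, aleph_id) pairs from the legacy and migrated databases.

    Assumes (hypothetically) that source IDs are preserved by the migration.
    Returns a dict with three sorted lists of source IDs:
      "missing": present in the legacy data but absent after migration
      "lost":    had an aleph_id before, has none after
      "changed": aleph_id differs between the two
    """
    migrated = dict(migrated_rows)
    report = {"missing": [], "lost": [], "changed": []}
    for source_id, legacy_aleph in legacy_rows:
        if source_id not in migrated:
            report["missing"].append(source_id)
            continue
        if not legacy_aleph:
            continue  # nothing to lose: the legacy row had no aleph_id either
        migrated_aleph = migrated[source_id]
        if not migrated_aleph:
            report["lost"].append(source_id)
        elif migrated_aleph != legacy_aleph:
            report["changed"].append(source_id)
    for ids in report.values():
        ids.sort()
    return report


# Made-up rows for illustration only; real rows would come from the two databases.
legacy = [(1, "A-001"), (2, "A-002"), (3, None), (4, "A-004"), (5, "A-005")]
migrated = [(1, "A-001"), (2, None), (3, None), (4, "A-999")]
print(find_aleph_id_regressions(legacy, migrated))
```

Keeping the comparison independent of any database driver means the same function works whether the pairs come from the old MySQL dump or from Seeder's own database.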

kvasnicaj commented 7 years ago

I sent it via Slack.

westfood commented 7 years ago

It looks normal:

(seeder) -bash-4.2$ /opt/Seeder/Seeder/manage.py legacy_sync
UserConversion
----------=======================================================================]: UserConversion
PublisherConversion
----------=======================================================================]: PublisherConversion
ContactsConversion
[================================================================================]: ContactsConversion
Skipped objects: 6731, 8850
----------
ConspectusConversion
----------=======================================================================]: ConspectusConversion
SubConspectusConversion
----------=======================================================================]: SubConspectusConversion
ResourceConversion
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2012-05-02 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-06-03 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2017-02-02 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-06-22 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2017-01-20 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-07-17 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-07-10 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-07-13 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-07-09 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-07-08 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-07-14 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-07-07 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-07-28 00:00:00) while time zone support is active.
  RuntimeWarning)
Could not parse date not exist---------------------------------------------------]: ResourceConversion
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-06-04 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-07-15 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2013-12-17 00:00:00) while time zone support is active.
  RuntimeWarning)
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-07-27 00:00:00) while time zone support is active.
  RuntimeWarning)
Could not parse date not exist==============================---------------------]: ResourceConversion
/opt/virtualenv/seeder/lib/python3.4/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date received a naive datetime (2015-06-05 00:00:00) while time zone support is active.
  RuntimeWarning)
----------=======================================================================]: ResourceConversion
RatingRoundConversion
----------=======================================================================]: RatingRoundConversion
VoteConversion
[================================================================================]: VoteConversion
Skipped objects: 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148
----------
SeedConversion
----------=======================================================================]: SeedConversion

ContractConversion
Skipped contracts:  [4361]=======================================================]: ContractConversion
Broken parent relationships:  []

Skipped objects: 4361
----------
QAConversion
----------=======================================================================]: QAConversion
KeyWordConversion
----------=======================================================================]: KeyWordConversionsion
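
A log like the one above is easier to compare between migration runs once it is condensed. The following is a minimal sketch, assuming the output keeps the shape shown here (a bare `...Conversion` line before each step, `Skipped objects:` lines with comma-separated IDs); the function name and the returned keys are made up for illustration and are not part of Seeder.

```python
import re
from collections import defaultdict

# A step starts with a bare line such as "ContactsConversion".
HEADER = re.compile(r"^(\w+Conversion)$")
# Skipped IDs are reported as "Skipped objects: 6731, 8850".
SKIPPED = re.compile(r"^Skipped objects:\s*(.*)$")


def summarize_sync_log(text):
    """Collect skipped IDs per conversion step and count date-related problems."""
    skipped = defaultdict(list)
    naive_warnings = 0
    unparsed_dates = 0
    current = None  # name of the conversion step currently being read
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        found = SKIPPED.match(line)
        if found:
            ids = [int(tok) for tok in found.group(1).split(",") if tok.strip()]
            skipped[current].extend(ids)
        elif "received a naive datetime" in line:
            naive_warnings += 1
        elif line.startswith("Could not parse date"):
            unparsed_dates += 1
    return {
        "skipped": dict(skipped),
        "naive_warnings": naive_warnings,
        "unparsed_dates": unparsed_dates,
    }


# Shortened excerpt in the same shape as the log above.
sample = "\n".join([
    "ContactsConversion",
    "[====]: ContactsConversion",
    "Skipped objects: 6731, 8850",
    "----------",
    "ResourceConversion",
    "fields/__init__.py:1430: RuntimeWarning: DateTimeField Source.screenshot_date "
    "received a naive datetime (2012-05-02 00:00:00) while time zone support is active.",
    "  RuntimeWarning)",
    "Could not parse date not exist-----]: ResourceConversion",
    "ContractConversion",
    "Skipped contracts:  [4361]====]: ContractConversion",
    "Skipped objects: 4361",
])
print(summarize_sync_log(sample))
```

Running it on the logs of two migration runs and comparing the results would show whether the second run skipped the same objects as the first.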

Visgean commented 7 years ago

IMHO we'll need to compare how many seeds and sources there are in the DB.

kvasnicaj commented 7 years ago

Seeder says: Zdroje (Sources), 10436 records

COUNT on the resources table in the old DB: 10436

kvasnicaj commented 7 years ago

Hey, after that second migration, some of those faulty sources now have an aleph_id. We'll check it again tomorrow with our colleagues; maybe it was just a one-off failure.