Woseseltops opened this issue 4 years ago
Thumbs up!
Can you think of anything that will go wrong @susanodd ?
Recall we've had problems getting the test database to work. There were some essential tables that needed to remain non-empty, and others that connect objects in different tables. Perhaps first try exporting, to see what the SQL creation commands that refill the tables look like? There was also a lot of Django-specific stuff that had accumulated; recall that you discovered this when we first started testing, and deleted gigabytes of junk. The migrations still contain very old "initialization" code from when the choice lists were first created. Some of that is actually still in the Django part: before we made a table for field choices, this lived inside the Python migrations. Recall that when we make migrations now, we delete tons of stuff from the automatically generated ones.
It seems important to make sure that obsolete Django stuff isn't carried over into a fresh database, and that the migrations are pruned down. We only want the newest database structure, not one derived from its whole evolution.
> Recall we've had problems getting the test database to work. There were some essential tables that needed to remain non-empty, and others that connect objects in different tables.
Ah right, good point @susanodd, there is a test database, which is a lightweight version of the full database! So I should request two databases.
> Recall that when we make migrations now, we delete tons of stuff from the automatically generated ones. And that the migrations are pruned down. We only want the newest database, not an evolution derived structure.
Ah yes, doing database migrations is not super clean these days. In the past we have at some point 'deleted' previous migrations and started fresh. I think you are right that this is a good time for such a restart, to prevent problems. Something like this: https://simpleisbetterthancomplex.com/tutorial/2016/07/26/how-to-reset-migrations.html
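The reset the tutorial describes boils down to a few commands. A sketch, demonstrated here on a throwaway directory tree instead of a real project (in the real project, the `find` commands run against the app directories, and `develop.py` is Signbank's manage script):

```shell
# Fake app tree standing in for a real Django project
mkdir -p demo/dictionary/migrations
touch demo/dictionary/migrations/__init__.py \
      demo/dictionary/migrations/0001_initial.py \
      demo/dictionary/migrations/0002_auto.py

# Step 1: delete every migration module except __init__.py
find demo -path "*/migrations/*.py" -not -name "__init__.py" -delete

# Step 2 (real project only): regenerate a single fresh initial migration
# python develop.py makemigrations
# Step 3 (real project only): mark it as applied without touching the tables
# python develop.py migrate --fake-initial

ls demo/dictionary/migrations   # only __init__.py remains
```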
Fun fact: I just made a test dump of the database (output is JSON), and it's almost 1 GB!
Gosh!
I just read the how-to on resetting migrations. I'm now wondering about the Django Guardian stuff?
@Woseseltops, what kind of output are you generating for export? You wrote JSON. Is it possible to just use a database viewer program and export the tables as SQL insert commands, and the SQL tables as CREATE commands? There are a lot of tables. But if they are exported as SQL commands, then the data should retain its type. And if all the tables are exported from a single database, then all the inter-table (object) references should be correct.
Perhaps we need to write some sort of "diff" comparison to compare the contents of both the old and new databases. Or just use Python to get all the objects from each model, write out dicts of everything, and compare those.
Thanks a lot for thinking along @susanodd !
> Is it possible to just use a database viewer program and export the tables as SQL insert commands, and the SQL tables as CREATE commands?
Technically yes, that is the raw, low-level way of doing it. What I am now experimenting with is a Django tool for things like this: `develop.py dumpdata` and `develop.py loaddata`. The advantage of this would be that all the things that would go wrong for everybody when migrating are already taken care of, like translation errors between SQLite and MySQL. Can you think of an advantage of the low-level way? I was thinking speed, but the exporting part took only a few minutes.
> But if they are exported as SQL commands, then the data should retain its type. And if all the tables are exported from a single database, then all the inter-table (object) references should be correct.
I think all of this is also covered by the 5-step workflow I describe above?
> Perhaps we need to write some sort of "diff" comparison to compare the contents of both the old and new databases. Or just use Python to get all the objects from each model, write out dicts of everything, and compare those.
As some sort of test you mean, to check if we did not lose data?
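Such a check over two `dumpdata` JSON dumps could be sketched like this (the toy fixtures at the bottom are illustrative, not real Signbank data):

```python
def index_fixture(objects):
    """Index dumpdata objects by (model, pk)."""
    return {(o["model"], o["pk"]): o["fields"] for o in objects}

def diff_fixtures(old_objects, new_objects):
    """Return (missing, extra, changed) keys between two dumps."""
    old, new = index_fixture(old_objects), index_fixture(new_objects)
    missing = sorted(set(old) - set(new))   # lost during migration
    extra = sorted(set(new) - set(old))     # appeared out of nowhere
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return missing, extra, changed

# Toy example: the same keyword, but its text changed between dumps
old = [{"model": "dictionary.keyword", "pk": 1, "fields": {"text": "OK"}}]
new = [{"model": "dictionary.keyword", "pk": 1, "fields": {"text": "ok"}}]
missing, extra, changed = diff_fixtures(old, new)
```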
Update: I tried not listening to @susanodd: link the new MySQL database and do a `develop.py migrate`. But it results in a classic circular Signbank problem, where it tries to create the database structure, but for that to work it already needs the structure to be in place. I'm going to try raw SQL next.
This morning I tried the other approach, setting up the database structure already by making a schema structure export in SQLite, and then run this SQL in the new MySQL database... but I ran into various issues related to the differences between SQLite and MySQL. Some of these I could fix (MySQL does not seem to like double quotes, and calls AUTOINCREMENT AUTO_INCREMENT), but there were also a lot of things that I did not know how to fix.
I guess I go back to fixing the circular errors then.
Okay, fixing the circular errors was not that bad. In all but one case, the problematic code was already in a `try ... except OperationalError`, which just did not work here because the error now counted as a `django.db.utils.ProgrammingError` instead of an `OperationalError`. I've reopened the issue where we tried to fix the circular errors, so we can extend the solution: #564
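The broadened error handling amounts to catching both exception types. A runnable sketch; `build_choice_list` is a hypothetical stand-in for code that queries a table at import time, and the exception classes stand in for the real ones in `django.db.utils`:

```python
# Stand-ins so the sketch runs without Django; real code imports these
# from django.db.utils instead.
class OperationalError(Exception):
    pass

class ProgrammingError(Exception):
    pass

def build_choice_list(field):
    """Hypothetical stand-in: queries a table that may not exist yet.
    SQLite reports a missing table as OperationalError, while MySQL
    raises ProgrammingError for the same situation."""
    raise ProgrammingError("Table 'dictionary_fieldchoice' doesn't exist")

try:
    choices = build_choice_list("Handshape")
except (OperationalError, ProgrammingError):
    choices = []  # table not created yet (first migrate): fall back gracefully
```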
After that, the database wanted to be created! I'm now trying `develop.py loaddata`. The first thing I ran into was that during the creation of the tables, Django already puts some data in there, which then creates a conflict when importing other data. Fortunately I already read here: https://www.shubhamdipt.com/blog/django-transfer-data-from-sqlite-to-another-database/ that you can simply fix this with:

```python
from django.contrib.contenttypes.models import ContentType
ContentType.objects.all().delete()
```
Now I get: `Could not load sites.Site(pk=1): (1062, "Duplicate entry 'example.com'`... would this also be data created by Django that can be removed?
It sounds like Django data, since Site is a Django concept. It's weird that there's an example.com in there.
It sounds like chicken or egg.
After first creating the new database, before importing the old stuff, can you export the new database and check what's in it? The linked article's database may not be as extensive as Signbank. There may be additional stuff that needs to be deleted.
Your thinking was correct! When creating a new database structure with `migrate`, Django already puts some content types and an example site in the database. If you remove these again manually, the problem goes away. This is what I did to remove the example site:

```python
from django.contrib.sites.models import Site
Site.objects.all().delete()
```
Next problem was that there apparently are some duplicates in the dump JSON that MySQL does not like. I'm going to write a small script to identify them automatically.
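Such a script might group the dump's objects by the value MySQL considers equal. A sketch, assuming the dump is a `dumpdata`-style JSON list (the sample objects below are made up):

```python
from collections import defaultdict

def find_duplicates(objects, model, field, normalize=str.lower):
    """Return field values that occur more than once for one model,
    after normalization. MySQL's default collation compares strings
    case-insensitively, so 'OK' and 'ok' collide on a unique column."""
    groups = defaultdict(list)
    for obj in objects:
        if obj["model"] == model:
            groups[normalize(obj["fields"][field])].append(obj["pk"])
    return {value: pks for value, pks in groups.items() if len(pks) > 1}

dump = [
    {"model": "dictionary.keyword", "pk": 2127, "fields": {"text": "OK"}},
    {"model": "dictionary.keyword", "pk": 3001, "fields": {"text": "ok"}},
    {"model": "dictionary.keyword", "pk": 3002, "fields": {"text": "huis"}},
]
dupes = find_duplicates(dump, "dictionary.keyword", "text")  # {'ok': [2127, 3001]}
```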
@Woseseltops Sharing my experience: the live instance of Mediate uses the same MySQL server. I also have a development instance on another server, with SQLite on the scratch disk (symlinked from writable/database). Here is the thing: this development instance seems faster than the live instance.
Hmm... that is indeed good to know. Although I wouldn't have expected it, I can retroactively come up with a reason: MySQL adds some overhead in the form of a separate process that needs to send data over sockets, while SQLite always reads the data directly from disk. The advantage of MySQL is that it is optimized for larger datasets (I think), so there must be some dataset-size threshold where MySQL's speed outweighs its overhead. Do you think this could be true @vanlummelhuizen ?
@Woseseltops Honestly, I don't know. For Mediate there are over half a million records for some models, far more than any Signbank model (I think).
MySQL may be more stable than SQLite, so I am not against using it. The speed of that specific MySQL server may not be higher than SQLite from local disk though. Perhaps a local MySQL server in a container setup may help.
Of course the number of queries and indexing may help speed. But I think we should discuss that in #504.
Problems that need to be fixed in the datadump:
For now I fixed this by removing the problem objects, but we will need to investigate what these objects are before we do the migration for real.
Currently stuck at this error message:
```
django.db.utils.IntegrityError: Problem installing fixture '/var/www/signbank/live/writable/database/dump_cleaned.json': Could not load dictionary.Keyword(pk=2127): (1062, "Duplicate entry 'OK' for key 'text'")
```
As far as I can tell, there is only 1 'OK' keyword
It took a loooot of patience, but now Django agrees to load the data in the fixture into the MySQL database. More problems I ran into:

```sql
ALTER TABLE dictionary_keyword CHANGE `text` `text` VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_bin;
```

I had searched for `'gloss': 0` in the whole dataset, but apparently I also needed to search for `'parent_gloss': 0`. So far I simply removed all problem items from the database. I have given it a bit more thought:
These are multiple translation objects that link the same gloss to the same keyword for the same language. I have investigated, and this is really garbage that can be removed.
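A cleanup for those could be sketched as a pass over the dumpdata JSON that keeps only the first translation object per (gloss, language, keyword) combination; field names are as they appear in the dump (the sample objects are made up):

```python
def dedupe_translations(objects):
    """Drop dictionary.translation objects that repeat an earlier
    (gloss, language, translation) combination; all other objects
    pass through unchanged."""
    seen = set()
    kept = []
    for obj in objects:
        if obj["model"] == "dictionary.translation":
            f = obj["fields"]
            key = (f["gloss"], f["language"], f["translation"])
            if key in seen:
                continue  # duplicate gloss-keyword-language link: skip it
            seen.add(key)
        kept.append(obj)
    return kept

dump = [
    {"model": "dictionary.translation", "pk": 1,
     "fields": {"gloss": 10, "language": 2, "translation": 99, "index": 0}},
    {"model": "dictionary.translation", "pk": 2,
     "fields": {"gloss": 10, "language": 2, "translation": 99, "index": 1}},
]
cleaned = dedupe_translations(dump)  # only pk=1 survives
```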
I will create a separate script that can import existing user profile data into newly created user profiles.
This is https://signbank.science.ru.nl/dictionary/gloss/0/ JAN-WILLEM-VAN-MANSVELT. I guess we can remove this one and recreate it, hoping that it will get a valid primary key this time? We also need to rewire all morphology, translations, frequency data and videos that refer to the pk=0 gloss:
```
{'fields': {'index': 0, 'language': 2, 'gloss': 0, 'translation': 3233}, 'model': 'dictionary.translation', 'pk': 5777}
{'fields': {'parent_gloss': 0, 'role': '2', 'morpheme': 3817}, 'model': 'dictionary.morphologydefinition', 'pk': 542}
{'fields': {'parent_gloss': 0, 'role': '3', 'morpheme': 1028}, 'model': 'dictionary.morphologydefinition', 'pk': 543}
{'fields': {'text': 'JAN-WILLEM-VAN-MANSVELT', 'gloss': 0, 'language': 2}, 'model': 'dictionary.annotationidglosstranslation', 'pk': 1634}
{'fields': {'text': 'JAN-WILLEM-VAN-MANSVELT', 'gloss': 0, 'language': 1}, 'model': 'dictionary.annotationidglosstranslation', 'pk': 6115}
{'fields': {'gloss': 0, 'document': 1195, 'speaker': 26, 'frequency': 3}, 'model': 'dictionary.glossfrequency', 'pk': 18892}
{'fields': {'version': 0, 'gloss': 0, 'videofile': 'glossvideo/NGT/JA/JAN-WILLEM-VAN-MANSVELT-0.mp4'}, 'model': 'video.glossvideo', 'pk': 1483}
```
The 'OK' problem mentioned above, which was caused by MySQL comparing strings case-insensitively by default. I fixed it by making this specific column case-sensitive:

```sql
ALTER TABLE dictionary_keyword CHANGE `text` `text` VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_bin;
```
@Woseseltops Do you know that in MySQL, utf8 is not really UTF-8? MySQL's utf8 can hold at most 3 bytes per character, while real UTF-8 characters can take up to 4 bytes. utf8mb4 is MySQL's real UTF-8. See for example: https://www.eversql.com/mysql-utf8-vs-utf8mb4-whats-the-difference-between-utf8-and-utf8mb4/
No I didn't know that @vanlummelhuizen ! I've looked it up, and most characters fit in 1-3 bytes, but I'm not so sure about all the Chinese characters. Sounds like it would be a good idea to use real UTF-8 by default. It's not a simple thing to do, apparently, however :( . https://dba.stackexchange.com/questions/8239/how-to-easily-convert-utf8-tables-to-utf8mb4-in-mysql-5-5
Okay, the script to install user profiles is working. Next up: creating a command that converts all tables and columns to real UTF-8
Note to self: useful overview of all columns: `select table_name,column_name,character_set_name from information_schema.columns;`
There is a lot of stuff with weird encodings in the GlossVideo table. When moving to MySQL can something be done about this?
With special utf-8 characters in filenames, as well as filenames with percent encoding, it causes problems to upload a new video. The code silently "does nothing" about the paths it can't handle. It's not possible to get rid of stuff in GlossVideo, so it keeps finding the "first" object in the list and not being able to do anything with its path. (#674)
(This is causing a problem with Farsi, also because concatenation works backwards.)
Look at this MySQL-monster:
```sql
ALTER TABLE auth_user_user_permissions CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_gloss_morphemePart CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE feedback_generalfeedback CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_keyword CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_lemmaidglosstranslation CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE django_summernote_attachment CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_blendmorphology CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_corpus CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE feedback_signfeedback CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tagging_taggeditem CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE pages_page CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE video_glossvideo CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE attachments_attachment CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_gloss CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_language CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE auth_user CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_annotationidglosstranslation CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_relation CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE video_video CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_glossfrequency CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE feedback_missingsignfeedback CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tagging_tag CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_gloss_dialect CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE django_session CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_morpheme CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE django_admin_log CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE auth_permission CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE guardian_userobjectpermission CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_gloss_creator CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE auth_group CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_dataset_owners CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_gloss_signlanguage CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_signlanguage CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_speaker CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_userprofile CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE django_site CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE pages_page_group_required CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE feedback_interpreterfeedback CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE reversion_version CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_simultaneousmorphologydefinition CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE pages_pagevideo CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE guardian_groupobjectpermission CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_morphologydefinition CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_userprofile_selected_datasets CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_othermedia CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_deletedglossormedia CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_dataset CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE auth_user_groups CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE django_content_type CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE auth_group_permissions CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE django_migrations CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_document CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE reversion_revision CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_dataset_exclude_choices CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_glossrevision CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_definition CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_fieldchoice CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_lemmaidgloss CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE video_glossvideohistory CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_translation CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_dialect CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_relationtoforeignsign CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_handshape CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE dictionary_dataset_translation_languages CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```
According to the StackOverflow answer I'm using, all columns should also be converted individually, making the command even larger. However, if I check with `select table_name,column_name,character_set_name,collation_name,column_type from information_schema.columns;`, everything is looking perfect already, so I think this can be skipped.
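A statement list like the monster above can also be generated instead of typed by hand. A minimal sketch; in practice the table names would come from `information_schema.tables` rather than a hard-coded list:

```python
def utf8mb4_statements(tables):
    """Produce one CONVERT statement per table, matching the commands above."""
    template = ("ALTER TABLE {} CONVERT TO CHARACTER SET utf8mb4 "
                "COLLATE utf8mb4_unicode_ci;")
    return [template.format(t) for t in tables]

stmts = utf8mb4_statements(["dictionary_gloss", "dictionary_keyword"])
```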
OMG!
Okay the database is UTF-8 ready. Next week, I will:
Recreate the gloss for Jan-Willem Mansveld. See if a newly created gloss after that does not get pk 0. Rewire all related objects.
Done, all seems okay. We now have https://signbank.science.ru.nl/dictionary/gloss/38194/
I've added a database switch to the settings, so I can quickly switch between MySQL and SQLite (with the latter being the real, up-to-date database).
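Not Signbank's actual settings, but such a switch might look roughly like this (all names here are illustrative):

```python
import os

# Flip via an environment variable (or a simple constant) without
# editing the rest of the settings file.
USE_MYSQL = os.environ.get("SIGNBANK_DB", "sqlite") == "mysql"

if USE_MYSQL:
    DATABASES = {"default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "signbank",
        "USER": "signbank",
        "PASSWORD": os.environ.get("SIGNBANK_DB_PASSWORD", ""),
        "HOST": "localhost",
        "PORT": "3306",
    }}
else:
    DATABASES = {"default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": "/var/www/signbank/live/writable/database/signbank.db",
    }}
```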
Unfortunately, when I switch to MySQL, Signbank does not even want to start. I get this error:
```
ImportError: No module named 'MySQLdb.constants'
```
Which is weird, because my MySQLdb is definitely there. I know this because I can use MySQL-related functionality (like loading data into it), and this test also strongly suggests it:
```
(sb-env) [wstoop@applejack:/var/www/signbank/live]$ python
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import MySQLdb
>>>
```
Any idea what's going on here, @vanlummelhuizen and @susanodd ?
@Woseseltops I have tried some stuff and googled for it, but unfortunately I have not found a possible solution.
Turns out there are multiple Python MySQL packages for different Python versions, with different names for pip vs import; Django seems to support them all (I think). An attempt to take away my own confusion:

| pip install | import | note |
|---|---|---|
| MySQL-python | MySQLdb | Python 2 only |
| mysqlclient | MySQLdb | fork of the previous |
| PyMySQL | pymysql | pure Python, has a function to 'act as MySQLdb' (?) |
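The 'act as MySQLdb' function is `pymysql.install_as_MySQLdb()`. A sketch of where it would go (placement is an assumption: it must run before anything imports `MySQLdb`, e.g. at the very top of the wsgi module or `develop.py`, and it assumes PyMySQL is installed):

```python
# Hypothetical placement: very top of wsgi.py / develop.py, before Django
# (or anything else) does "import MySQLdb".
import pymysql
pymysql.install_as_MySQLdb()  # "import MySQLdb" now resolves to PyMySQL
```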
Today I tried to install PyMySQL instead of MySQLdb, using its option `install_as_MySQLdb`. Like with the other package, it works when you try it separately (in the Django shell), but when I run the actual application, I get:

```
AttributeError: module 'pymysql' has no attribute 'install_as_MySQLdb'
```
A similar kind of error, like it didn't load the module correctly or something. I'm now going to test the hypothesis that it's related to uwsgi/Emperor.
Ha! It's Emperor-related! If I run the uwsgi process by hand, I get a new error: `Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock'`! Progress!
What's the status of this issue, @Woseseltops ?
Can this be closed?
@susanodd I think using MySQL is not yet possible. @Woseseltops correct?
@Woseseltops Any progress here?
Any progress here? (more than a year after the same question...)
In the light of moving to a database server, there are concerns that many API calls may (b)lock the SQLite database (#1331, #1332). Perhaps, for now, we could do some SQLite optimization as described in https://blog.pecar.me/sqlite-django-config. What do you think @Woseseltops ?
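The core of that kind of SQLite tuning is WAL journaling plus a busy timeout, so concurrent writers wait briefly instead of failing with "database is locked". A stdlib-only sketch (in a Django project you would apply the same PRAGMAs via the sqlite3 backend's options or a `connection_created` signal handler):

```python
import os
import sqlite3
import tempfile

def open_optimized(path):
    """Open SQLite with WAL mode and a busy timeout."""
    conn = sqlite3.connect(path, timeout=5.0)   # wait up to 5 s on a lock
    conn.execute("PRAGMA journal_mode=WAL;")    # readers no longer block the writer
    conn.execute("PRAGMA synchronous=NORMAL;")  # safe with WAL, fewer fsyncs
    return conn

# Demo on a throwaway file database (WAL requires a file, not :memory:)
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = open_optimized(db_path)
mode = conn.execute("PRAGMA journal_mode;").fetchone()[0]  # "wal"
```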
Sorry for missing your question last year @vanlummelhuizen ! This issue ended up very low on my todo list in 2020, so no progress here. Given that the funding situation for Signbank after 2024 is unclear, it's probably unwise to take up major new projects, so SQLite optimization is probably the better choice indeed; I didn't know something like that was possible!
@Woseseltops , @vanlummelhuizen See this:
https://github.com/Signbank/Global-signbank/issues/1331#issuecomment-2407281734
There are repeated sequences in the signbank.uwsgi.log file.
A package retrieval, then 8 update gloss calls in the API, then the database is locked. This happens within one minute.
The locked database then triggers further errors, because the URL requests keep coming.
Okay, I found a sequence of 32 gloss updates before the database locked error happened.
```
[pid: 848|app: 0|req: 27170/59682] 10.208.155.251 () {60 vars in 1232 bytes} [Thu Oct 10 20:11:26 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/DISNEY+-C.zip => generated 65 bytes in 9 msecs (HTTP/1.1 200) 5 headers in 147 bytes (1 switches on core 0)
[pid: 848|app: 0|req: 27171/59683] 10.208.155.251 () {60 vars in 1109 bytes} [Thu Oct 10 20:11:26 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 8 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 3)
[pid: 848|app: 0|req: 27172/59684] 10.208.155.251 () {60 vars in 1258 bytes} [Thu Oct 10 20:11:27 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/ALPHEN-AAN-DEN-RIJN%20.zip => generated 65 bytes in 9 msecs (HTTP/1.1 200) 5 headers in 147 bytes (1 switches on core 1)
[pid: 844|app: 0|req: 16383/59685] 10.208.155.251 () {60 vars in 1109 bytes} [Thu Oct 10 20:11:27 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 10 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 1)
```
Is there any reason why the filename is the gloss annotation rather than something like "video_archive.zip"?
```
[pid: 848|app: 0|req: 27177/59702] 10.208.155.251 () {60 vars in 1206 bytes} [Thu Oct 10 20:11:31 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/ => generated 122591 bytes in 149 msecs (HTTP/1.1 500) 5 headers in 178 bytes (1 switches on core 0)
[pid: 839|app: 0|req: 6037/59703] 10.208.155.251 () {60 vars in 1109 bytes} [Thu Oct 10 20:11:31 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 9 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 0)
[pid: 840|app: 0|req: 10096/59704] 10.208.155.251 () {60 vars in 1232 bytes} [Thu Oct 10 20:11:31 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/DISNEY+-A.zip => generated 65 bytes in 8 msecs (HTTP/1.1 200) 5 headers in 147 bytes (1 switches on core 0)
[pid: 848|app: 0|req: 27178/59705] 10.208.155.251 () {60 vars in 1109 bytes} [Thu Oct 10 20:11:31 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 8 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 2)
[pid: 848|app: 0|req: 27179/59706] 10.208.155.251 () {60 vars in 1234 bytes} [Thu Oct 10 20:11:32 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/INFLATIE-B.zip => generated 102 bytes in 129 msecs (HTTP/1.1 200) 5 headers in 148 bytes (1 switches on core 0)
[pid: 839|app: 0|req: 6038/59707] 10.208.155.251 () {60 vars in 1109 bytes} [Thu Oct 10 20:11:32 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 200 bytes in 32 msecs (HTTP/1.1 200) 5 headers in 183 bytes (3 switches on core 3)
[pid: 844|app: 0|req: 16395/59708] 10.208.155.251 () {60 vars in 1232 bytes} [Thu Oct 10 20:11:32 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/SAMEN+2-B.zip => generated 65 bytes in 9 msecs (HTTP/1.1 200) 5 headers in 147 bytes (1 switches on core 2)
[pid: 848|app: 0|req: 27180/59709] 10.208.155.251 () {60 vars in 1109 bytes} [Thu Oct 10 20:11:32 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 8 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 1)
[pid: 840|app: 0|req: 10097/59710] 10.208.155.251 () {60 vars in 1232 bytes} [Thu Oct 10 20:11:33 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/SAMEN+3-B.zip => generated 65 bytes in 7 msecs (HTTP/1.1 200) 5 headers in 147 bytes (1 switches on core 3)
[pid: 844|app: 0|req: 16396/59711] 10.208.155.251 () {60 vars in 1109 bytes} [Thu Oct 10 20:11:33 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 8 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 3)
[pid: 840|app: 0|req: 10098/59712] 10.208.155.251 () {60 vars in 1232 bytes} [Thu Oct 10 20:11:33 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/SAMEN+4-B.zip => generated 65 bytes in 10 msecs (HTTP/1.1 200) 5 headers in 147 bytes (1 switches on core 1)
[pid: 848|app: 0|req: 27181/59713] 10.208.155.251 () {60 vars in 1109 bytes} [Thu Oct 10 20:11:33 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 11 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 0)
```
```
  File "/usr/lib/python3.12/urllib/request.py", line 639, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
```
```
[pid: 848|app: 0|req: 27188/59728] 10.208.155.251 () {60 vars in 1206 bytes} [Thu Oct 10 20:11:37 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/ => generated 122591 bytes in 72 msecs (HTTP/1.1 500) 5 headers in 178 bytes (1 switches on core 3)
[pid: 848|app: 0|req: 27189/59729] 10.208.155.251 () {60 vars in 1109 bytes} [Thu Oct 10 20:11:37 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 8 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 1)
```
```
Internal Server Error: /dictionary/upload_zipped_videos_folder_json/5/
Traceback (most recent call last):
  File "/var/www/env/lib/python3.12/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/var/www/env/lib/python3.12/site-packages/django/core/handlers/base.py", line 197, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/www/repo/signbank/api_token.py", line 69, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/var/www/repo/signbank/api_interface.py", line 327, in upload_zipped_videos_folder_json
    with urllib.request.urlopen(zipped_file_url) as response:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/urllib/request.py", line 215, in urlopen
    return opener.open(url, data, timeout)
```
It looks like some of the zipfile urls are missing the actual zip file.
```
[pid: 848|app: 0|req: 27674/60814] 10.208.155.251 () {60 vars in 1206 bytes} [Fri Oct 11 00:10:54 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/ => generated 122591 bytes in 102 msecs (HTTP/1.1 500) 5 headers in 178 bytes (1 switches on core 0)
[pid: 848|app: 0|req: 27675/60815] 10.208.155.251 () {60 vars in 1109 bytes} [Fri Oct 11 00:10:54 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 8 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 2)
[pid: 839|app: 0|req: 6135/60816] 10.208.155.251 () {60 vars in 1230 bytes} [Fri Oct 11 00:10:54 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/WEEK+1-A.zip => generated 65 bytes in 9 msecs (HTTP/1.1 200) 5 headers in 147 bytes (1 switches on core 0)
[pid: 848|app: 0|req: 27676/60817] 10.208.155.251 () {60 vars in 1109 bytes} [Fri Oct 11 00:10:54 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 10 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 1)
[pid: 839|app: 0|req: 6136/60818] 10.208.155.251 () {60 vars in 1230 bytes} [Fri Oct 11 00:10:55 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/WEEK+2-A.zip => generated 65 bytes in 10 msecs (HTTP/1.1 200) 5 headers in 147 bytes (1 switches on core 2)
[pid: 848|app: 0|req: 27677/60819] 10.208.155.251 () {60 vars in 1109 bytes} [Fri Oct 11 00:10:55 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 9 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 0)
[pid: 840|app: 0|req: 10305/60820] 10.208.155.251 () {60 vars in 1230 bytes} [Fri Oct 11 00:10:55 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/WEEK+3-A.zip => generated 65 bytes in 9 msecs (HTTP/1.1 200) 5 headers in 147 bytes (1 switches on core 0)
[pid: 848|app: 0|req: 27678/60821] 10.208.155.251 () {60 vars in 1109 bytes} [Fri Oct 11 00:10:55 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 11 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 3)
[pid: 839|app: 0|req: 6137/60822] 10.208.155.251 () {60 vars in 1230 bytes} [Fri Oct 11 00:10:56 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/WEEK+4-A.zip => generated 65 bytes in 9 msecs (HTTP/1.1 200) 5 headers in 147 bytes (1 switches on core 3)
[pid: 848|app: 0|req: 27679/60823] 10.208.155.251 () {60 vars in 1109 bytes} [Fri Oct 11 00:10:56 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 8 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 2)
[pid: 848|app: 0|req: 27680/60824] 10.208.155.251 () {60 vars in 1230 bytes} [Fri Oct 11 00:10:56 2024] GET /dictionary/upload_zipped_videos_folder_json/5/?file=https://www.gomerotterspeer.nl/WEEK+5-A.zip => generated 65 bytes in 10 msecs (HTTP/1.1 200) 5 headers in 147 bytes (1 switches on core 1)
[pid: 839|app: 0|req: 6138/60825] 10.208.155.251 () {60 vars in 1109 bytes} [Fri Oct 11 00:10:56 2024] GET /dictionary/upload_videos_to_glosses/5/ => generated 29 bytes in 10 msecs (HTTP/1.1 200) 5 headers in 183 bytes (2 switches on core 0)
Internal Server Error: /dictionary/upload_zipped_videos_folder_json/5/
```
Those above all worked. For the ones that caused errors, the actual zip file was not in the file argument, just the website prefix 'file=https://www.gomerotterspeer.nl/'.
Sorry, the comments here are relevant to the API. And the traffic. And the database getting locked. There is zero time between the requests.
In this issue I want to investigate what should happen for us to move to MySQL, so we can get rid of #441. This is what I could come up with.
Phase 1: preparation
Phase 2: migration
Run `migrate` so the MySQL database has the correct structure

Expected problems:
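Pieced together from the comments in this thread, the migration workflow might look roughly like this (a sketch; `develop.py` is Signbank's manage script, and the settings switch to MySQL is assumed to happen between steps 1 and 2):

```shell
# 1. Dump everything from the current SQLite database
python develop.py dumpdata > dump.json

# 2. With settings pointing at the empty MySQL database, build the structure
python develop.py migrate

# 3. Delete the rows migrate creates on its own, or loaddata will clash
python develop.py shell -c "
from django.contrib.contenttypes.models import ContentType
from django.contrib.sites.models import Site
ContentType.objects.all().delete()
Site.objects.all().delete()
"

# 4. Clean the dump (duplicates, the pk=0 gloss) and load it
python develop.py loaddata dump.json
```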
@susanodd and @vanlummelhuizen, I really need your help here :)