gronlund / cvrdata

Extract data from the Danish CVR registry from the Danish Business Authority
MIT License

Has the script finished? #5

Open emilla97 opened 2 years ago

emilla97 commented 2 years ago

Hi Allan,

I am using your script to retrieve data for a Master's thesis. The script has been running for more than 12 hours, and the terminal output has not changed for the last 8 hours. I am now unsure whether it has finished or is just stuck. The last part of the terminal output looks like this:

4101505it [1:55:59, 803.84it/s]
session update failure (MySQLdb._exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
[SQL: INSERT INTO Organisation (enhedsnummer, hovedtype, navn, gyldigfra, gyldigtil, sidstopdateret) VALUES (%s, %s, %s, %s, %s, %s)]
[parameters: ((4003832982, 'REGISTER', 'EJERREGISTER', '1903-01-01', '2016-06-07', datetime.datetime(2016, 7, 14, 13, 29, 55, tzinfo=)), (4003842070, 'REGISTER', 'EJERREGISTER', '1903-01-01', '2200-01-01', datetime.datetime(2015, 1, 9, 12, 43, 40, tzinfo=)), (4003853542, 'REGISTER', 'EJERREGISTER', '1903-01-01', '2200-01-01', datetime.datetime(2015, 1, 19, 20, 26, 21, tzinfo=)), (4004376466, 'LEDELSESORGAN', 'Direktion', '1985-12-09', '2016-10-12', datetime.datetime(2016, 10, 12, 13, 46, 24, tzinfo=)), (4004376467, 'REVISION', 'Revision', '1985-12-09', '2016-10-12', datetime.datetime(2016, 10, 12, 13, 46, 24, tzinfo=)), (4004376468, 'LEDELSESORGAN', 'Likvidator', '1985-12-09', '2016-10-12', datetime.datetime(2016, 10, 12, 13, 46, 24, tzinfo=)), (4004477026, 'STIFTERE', 'Stiftere', '1991-11-01', '2002-12-30', datetime.datetime(2015, 2, 10, 1, 0, tzinfo=)), (4004477027, 'LEDELSESORGAN', 'Bestyrelse', '1991-11-01', '2002-12-30', datetime.datetime(2015, 2, 10, 1, 0, tzinfo=)) ... displaying 10 of 316 total bound parameter sets ... (4009036082, 'LEDELSESORGAN', 'Direktion', '2021-10-01', '2200-01-01', datetime.datetime(2021, 10, 11, 16, 43, 5, tzinfo=)), (4009176882, 'REVISION', 'Revision', '2022-02-15', '2200-01-01', datetime.datetime(2022, 2, 17, 16, 16, 26, tzinfo=)))]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
5171470it [2:34:17, 247.81it/s]in != ud lets do something anyways
5209217it [2:36:52, 274.52it/s]in != ud lets do something anyways
6077804it [3:29:53, 482.61it/s]
2022-04-07 19:58:01,528 - producer - INFO - objects parsing done
2022-04-07 19:58:01,529 - producer - INFO - Producer Done.
Exiting...61619
2022-04-07 19:58:01,529 - producer - INFO - Producer Time Used: 12618.718698978424
Producer done adding sentinels
Producer Done - Adding Sentinels
Adding sentinel 0
Adding sentinel 1
Adding sentinel 2
2022-04-07 19:58:15,449 - consumer-61622 - INFO - sentinel found - Thats it im out of here
waiting for consumers
2022-04-07 19:58:15,457 - consumer-61620 - INFO - sentinel found - Thats it im out of here
Consumer Done. Exiting...61620 - time used 2.497490882873535
Consumer Done. Exiting...61622 - time used 2.721281051635742
waiting for consumers <Process name='Process-3' pid=61621 parent=61617 started daemon>

Looking forward to your response. I would appreciate any help!

Best regards Emil Andersen

gronlund commented 2 years ago

Sorry for the very late reply. The script looks to be done, but I would have to check the logs to make sure that everything worked as it was supposed to.
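
On the deadlock reported in the log: the MySQL message itself says "try restarting transaction", so one way to handle it would be to retry the failed batch a few times with a short delay. Below is a minimal sketch of that idea in plain Python. It is not cvrdata's actual code, and `run_with_deadlock_retry`, `DeadlockError` and `make_flaky_insert` are hypothetical names made up for illustration. The fake insert only simulates a deadlock so the example can run without a database.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error carrying MySQL error code 1213."""


def run_with_deadlock_retry(operation, is_deadlock, max_attempts=3,
                            base_delay=0.1, sleep=time.sleep):
    """Call operation(); on a deadlock error, wait and try again.

    is_deadlock(exc) decides whether an exception counts as a deadlock.
    Any other error, or a deadlock on the last attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            if not is_deadlock(exc) or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_insert(failures):
    """Return a fake batch insert that deadlocks `failures` times, then succeeds."""
    state = {"calls": 0}

    def insert_batch():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise DeadlockError(1213, "Deadlock found when trying to get lock")
        return state["calls"]  # number of calls it took to succeed

    return insert_batch


if __name__ == "__main__":
    attempts_needed = run_with_deadlock_retry(
        make_flaky_insert(2),
        is_deadlock=lambda exc: isinstance(exc, DeadlockError),
        sleep=lambda seconds: None,  # skip real waiting in this demo
    )
    print("succeeded after", attempts_needed, "calls")
```

With a real SQLAlchemy session, the retried operation would also need to roll back the session before re-issuing the insert, and the deadlock check would have to inspect the driver's error code rather than a custom exception class.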