Knowledge-Graph-Hub / kg-obo

A package to transform all OBO ontologies into KGX TSV format and OBO json, and put the transformed graph in KGhub
https://knowledge-graph-hub.github.io/kg-obo/getting_started.html
GNU General Public License v3.0

Illegal instruction (core dumped) during build #179

Open caufieldjh opened 2 years ago

caufieldjh commented 2 years ago

Describe the bug

Builds are failing with the error Illegal instruction (core dumped) - though not always at the same point.

To Reproduce

In Jenkins build 160, this error appears near the end of the transform step, after all ontologies have been checked for new versions and the root index is updated:

...
15:34:40  Looking for kg-obo/mi/index.html
15:34:40  Found kg-obo/mi/index.html
15:34:40  Looking for kg-obo/geo/index.html
15:34:40  Found kg-obo/geo/index.html
15:34:40  Looking for kg-obo/mfmo/index.html
15:34:46  INFO:kg-obo:Updated root index at kg-obo
15:34:46  INFO:kg-obo:Removed local data from data.
15:34:50  Illegal instruction (core dumped)

In build 159, this happens immediately at the beginning of the transform step:

15:00:32  + python3.9 run.py --bucket kg-hub-public-data --no_dl_progress --force_index_refresh
15:00:42  Illegal instruction (core dumped)

In build 158, this happens as in build 160:

15:34:40  Looking for kg-obo/mi/index.html
15:34:40  Found kg-obo/mi/index.html
15:34:40  Looking for kg-obo/geo/index.html
15:34:40  Found kg-obo/geo/index.html
15:34:40  Looking for kg-obo/mfmo/index.html
15:34:46  INFO:kg-obo:Updated root index at kg-obo
15:34:46  INFO:kg-obo:Removed local data from data.
15:34:50  Illegal instruction (core dumped)

In transform.py, the only operation remaining after updating the root index and removing local data is setting the lock file, so that could be the trigger here (at least for builds 158 and 160). The lock file is also checked at the beginning of each set of transforms, which could explain the error in build 159. Not sure why it wouldn't happen at that point in the other builds.

To do: check kg_obo.upload.check_lock and kg_obo.upload.set_lock. I wouldn't expect an S3 error to manifest as an "illegal instruction", but maybe it's due to some kind of illegal bucket access.
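One way to narrow down where the crash happens is Python's built-in faulthandler module, which dumps the Python traceback when the process receives a fatal signal such as SIGILL (the signal behind "Illegal instruction"). The sketch below is only a suggestion and is not part of kg-obo; the helper name enable_crash_traceback is hypothetical, and it would need to be called early in run.py.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Ask Python to dump tracebacks for all threads on a fatal signal.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS
    and SIGILL. The stream must be a real file (it needs a fileno()).
    Hypothetical helper, not part of kg-obo.
    """
    if stream is None:
        stream = sys.stderr
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

The same effect is available without code changes by running the interpreter with -X faulthandler or by setting PYTHONFAULTHANDLER=1 in the Jenkins environment. If the traceback ends inside a compiled extension rather than in the lock-file code, that would point away from S3.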

Version

eccbe215fe122e45b71968cc49b48777d16c1032

caufieldjh commented 2 years ago

This appears related but unanswered: https://stackoverflow.com/questions/63415400/illegal-instruction-core-dumped-docker-red-hat-7-7-aws-ec2

caufieldjh commented 2 years ago

Should see if this completes on local machine.

caufieldjh commented 2 years ago

I frequently get the RequestTimeTooSkewed error when modifying S3 objects through boto3, like below:

creating lock file s3_bucket:kg-hub-public-data, s3_path:kg-obo/lock
Encountered error in setting lockfile on S3: An error occurred (RequestTimeTooSkewed) when calling the PutObject operation: The difference between the request time and the current time is too large.
Could not set lock file on remote server. Exiting...

I wonder if that's what's happening here, and it somehow becomes an Illegal instruction?

Anyway, the solution is to update the system time:

sudo ntpdate time.nist.gov

caufieldjh commented 2 years ago

Transform completes as expected on local machine.

The time in a Docker container should be the same as on the host machine, but that doesn't mean the host time hasn't drifted.
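To check for drift before touching S3, the local clock can be compared against the Date header of any HTTP response from a trusted server. This is a minimal sketch under some assumptions: the function names are made up for illustration, and the 15-minute threshold is what I understand S3 to tolerate before returning RequestTimeTooSkewed, so treat it as an assumption.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumption: S3 rejects requests whose timestamp is off by more than this.
MAX_SKEW = timedelta(minutes=15)


def clock_skew(date_header, local_now=None):
    """Return local time minus server time, given an HTTP Date header.

    date_header is a string such as 'Wed, 21 Oct 2015 07:28:00 GMT'.
    local_now defaults to the current UTC time.
    """
    server_time = parsedate_to_datetime(date_header)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return local_now - server_time


def is_too_skewed(date_header, local_now=None, max_skew=MAX_SKEW):
    """True if the clocks differ by more than max_skew in either direction."""
    return abs(clock_skew(date_header, local_now)) > max_skew
```

If is_too_skewed returns True on the build host, syncing the host clock (as with ntpdate above) should also fix the time inside the container.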