sagemathinc / cocalc

CoCalc: Collaborative Calculation in the Cloud
https://CoCalc.com

compute servers: race condition creating default firewall #7603

Open williamstein opened 1 month ago

williamstein commented 1 month ago

If you create a brand-new Google Cloud project and then start building two compute server images in parallel, there is a race condition: both builds try to create the same default firewall at the same time. Here's the log:

~/cocalc/src$ sudo su
root@prod-42:/projects/6b851643-360e-435e-b87e-f9a6ab64a8b1/cocalc/src# cd
root@prod-42:~# cd /cocalc/src/
root@prod-42:/cocalc/src# cd packages/server/
root@prod-42:/cocalc/src/packages/server# DEBUG=cocalc:* node
Welcome to Node.js v18.17.1.
Type ".help" for more information.
> a = require('./dist/compute/cloud/google-cloud/create-image'); await a.createImages({image:"python"})
***

Logging to "/cocalc/src/data/logs/log" via the debug module
with  DEBUG='cocalc:*'.
Use   DEBUG_FILE='path' and DEBUG_CONSOLE=[yes|no] to override.
Using DEBUG='cocalc:*,-cocalc:silly:*' to control log levels.

***
{
  image: 'python',
  data: {
    priority: 10,
    label: 'Python',
    package: 'sagemathinc/python',
    minDiskSizeGb: 10,
    dockerSizeGb: 1,
    gpu: false,
    icon: 'python',
    vidoes: [ 'https://youtu.be/_y5FEj9o4aY' ],
    url: 'https://www.python.org/',
    source: 'https://github.com/sagemathinc/cocalc-compute-docker/blob/main/src/python',
    versions: [ [Object], [Object] ],
    description: '[Python](https://python.org) is a versatile and user-friendly programming language, known for its clear syntax and readability. It is widely used for web development, data analysis, artificial intelligence, and scientific computing.'
  }
}
logging to  logs/cocalc-python-2024-05-18-arm64.log
logging to  logs/cocalc-python-2024-05-18.log

-----------------------
 WARNING: the following VM's were NOT deleted due to errors or options --  [ 'cocalc-python-2024-05-18-arm64', 'cocalc-python-2024-05-18' ] Note that each instance will still be automatically deleted after about 60 minutes. 
-----------------------

Uncaught:
GoogleError: The resource 'projects/cocalc-compute-dev/global/firewalls/compute-default' already exists
    at GoogleError.parseHttpError (/cocalc/src/packages/node_modules/.pnpm/google-gax@4.3.5_encoding@0.1.13/node_modules/google-gax/build/src/googleError.js:72:37)
    at decodeResponse (/cocalc/src/packages/node_modules/.pnpm/google-gax@4.3.5_encoding@0.1.13/node_modules/google-gax/build/src/fallbackRest.js:66:49) {
  details: [],
  code: 10
}

The firewall does get created, and things work if you try again. Since this can only happen the first time you set up a Google Cloud project, it's low priority right now.
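
One possible way to make the firewall creation tolerant of this race is to treat an "already exists" failure as success, since a parallel build creating the firewall first leaves things in the desired state. The following is only a sketch under stated assumptions, not the actual CoCalc code: `createIfMissing` and `isAlreadyExists` are hypothetical helpers, the firewall-creating call is passed in as a plain function rather than taken from the Google Cloud client, and the error is detected by matching the message text seen in the log above.

```typescript
// Hypothetical helpers for illustration; not part of CoCalc or the
// Google Cloud client library.
type CreateFn = () => Promise<void>;

// Assumption: the error is recognized by its message text, as in the
// logged "The resource '...' already exists" GoogleError.
function isAlreadyExists(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return /already exists/i.test(message);
}

// Run `create` and treat "already exists" as success: if a parallel
// build created the same resource first, there is nothing left to do.
async function createIfMissing(
  create: CreateFn,
): Promise<"created" | "already-existed"> {
  try {
    await create();
    return "created";
  } catch (err) {
    if (isAlreadyExists(err)) {
      return "already-existed";
    }
    // Any other failure is a real error and is rethrown unchanged.
    throw err;
  }
}
```

With a wrapper like this, whichever of the two parallel image builds loses the race would continue instead of aborting, and any unrelated error would still surface as before.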