Open Desperationis opened 3 years ago
Replays show nothing special. When run in the same seed as their JS counterparts, they result in the exact same behavior.
I've attached the simple bot I've been trying to compile below. The exact command I used was sudo lux-ai-2021 main.cpp main.cpp
simple.zip
I should mention that I'm on Linux Mint btw, so this error might not appear on Windows
After rigorous testing, I am more confused than I was before. I spammed that command on a specific seed for a single bot and it still failed every so often, meaning the chance of this issue occurring is completely random and not dependent on seed or the bot run. Wth
The ETXTBSY error is unrelated to the game engine and is something else. Works fine on my machine. Please stay tuned to this thread; I may ask you to test some things.
Just making sure this isn't stale, is this still an issue @Desperationis ?
It's still an issue, not sure what's causing it. What I know for certain though is that running the lux-ai command multiple times on the same seed will eventually run the match
Oh ok so this error means you are trying to modify an executable while it is already running which is a weird error.
Do you by any chance have the executable file opened for some reason?
Additionally, can you compile your C++ as you normally would and then execute it from the command line yourself, not via lux-ai? (It should just hang.)
No, I don't. In fact, even if I run the executable compiled with g++ with lux-ai, it has no effect on the behaviour
@Desperationis Ok so this issue is not even consistently happening either, hard to reproduce, which leads me to think there's something up with your setup.
Can you tell me what node and npm version you are on and what linux mint version you are on?
I think worst case, use docker to run matches (we can provide a simple script for this) but hopefully that won't be necessary.
I'm running node v14.17.2 and npm v7.19.1 on Linux Mint Cinnamon Uma, though this happened on Ulyssa as well.
still not sure sorry.
I found this thread where someone else was using a nodejs application for something else and got the same error: https://github.com/alixaxel/chrome-aws-lambda/issues/69
Still not exactly sure what's going on. Maybe permissions? In which case can you do ls -la in the folder with the main.cpp file?
This is before running lux with main.cpp:
total 44
drwx------ 4 diego diego 4096 Jul 23 20:30 .
drwxr-xr-x 16 diego diego 4096 Jul 24 12:46 ..
-rwxr-xr-x 1 diego diego 105 Jul 23 20:30 compile.bat
-rwxr-xr-x 1 diego diego 118 Jul 23 20:30 compile.sh
drwx------ 3 diego diego 4096 Jul 23 20:30 internals
drwx------ 3 diego diego 4096 Jul 23 20:30 lux
-rw-rw-r-- 1 diego diego 4441 Jul 23 20:30 main.cpp
-rw-rw-r-- 1 diego diego 1928 Jul 23 20:30 main.py
-rw-rw-r-- 1 diego diego 265 Jul 23 20:30 package.json
-rw-rw-r-- 1 diego diego 383 Jul 23 20:30 package-lock.json
After a successful run right after:
total 212
drwx------ 6 diego diego 4096 Jul 24 12:48 .
drwxr-xr-x 16 diego diego 4096 Jul 24 12:46 ..
-rwxr-xr-x 1 diego diego 105 Jul 23 20:30 compile.bat
-rwxr-xr-x 1 diego diego 118 Jul 23 20:30 compile.sh
drwxrwxr-x 2 diego diego 4096 Jul 24 12:48 errorlogs
drwx------ 3 diego diego 4096 Jul 23 20:30 internals
drwx------ 3 diego diego 4096 Jul 23 20:30 lux
-rw-rw-r-- 1 diego diego 4441 Jul 23 20:30 main.cpp
-rwxrwxr-x 1 diego diego 162032 Jul 24 12:48 main.out
-rw-rw-r-- 1 diego diego 1928 Jul 23 20:30 main.py
-rw-rw-r-- 1 diego diego 265 Jul 23 20:30 package.json
-rw-rw-r-- 1 diego diego 383 Jul 23 20:30 package-lock.json
drwxrwxr-x 2 diego diego 4096 Jul 24 12:48 replays
After a bad run with the ETXTBSY error:
total 212
drwx------ 6 diego diego 4096 Jul 24 12:49 .
drwxr-xr-x 16 diego diego 4096 Jul 24 12:46 ..
-rwxr-xr-x 1 diego diego 105 Jul 23 20:30 compile.bat
-rwxr-xr-x 1 diego diego 118 Jul 23 20:30 compile.sh
drwxrwxr-x 2 diego diego 4096 Jul 24 12:49 errorlogs
drwx------ 3 diego diego 4096 Jul 23 20:30 internals
drwx------ 3 diego diego 4096 Jul 23 20:30 lux
-rw-rw-r-- 1 diego diego 4441 Jul 23 20:30 main.cpp
-rwxrwxr-x 1 diego diego 162032 Jul 24 12:49 main.out
-rw-rw-r-- 1 diego diego 1928 Jul 23 20:30 main.py
-rw-rw-r-- 1 diego diego 265 Jul 23 20:30 package.json
-rw-rw-r-- 1 diego diego 383 Jul 23 20:30 package-lock.json
drwxrwxr-x 2 diego diego 4096 Jul 24 12:48 replays
As a test, I ran sudo chmod +x on main.py, main.cpp, and main.out and still got the ETXTBSY error. I did the same thing to only main.py and main.cpp on a fresh run of simple and got the same result.
Hi @Desperationis can you try this script as a test? Put this script into the same directory you call lux-ai-2021 from. Replace line 2 with the correct CWD (so remove 'path/to/dir/of/main.out').
const { spawn } = require('child_process');
const p = spawn('./main.out', {cwd: 'path/to/dir/of/main.out');
p.stdout.on('data', (data) => {
console.log(`stdout: ${data}`);
});
p.stderr.on('data', (data) => {
console.error(`stderr: ${data}`);
});
p.on('close', (code) => {
console.log(`child process exited with code ${code}`);
});
Let me know if the ETXTBSY error pops up.
I tried the JS script with node and the program just hung; it didn't produce any output on stdout or stderr, and it behaved as if you had just run main.out raw.
Btw you missed a } on line 2 of the code.
But yeah, no ETXTBSY error. Idk if this piece of information is useful or anything but running main.out raw from the command line never produces the error, though it doesn't produce any output either.
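For anyone repeating this test, here is a minimal sketch of the same script with the closing brace restored. It also listens for the 'error' event, which is where a spawn failure such as ETXTBSY or ENOENT would show up. The function name trySpawn and its parameters are placeholders for illustration, not part of lux-ai-2021.

```javascript
const { spawn } = require('child_process');

// Spawn an executable and report how the attempt ended.
// `command`, `args` and `cwd` are placeholders; substitute your own paths.
function trySpawn(command, args = [], cwd = process.cwd()) {
  return new Promise((resolve) => {
    const p = spawn(command, args, { cwd });
    p.stdout.on('data', (data) => {
      console.log(`stdout: ${data}`);
    });
    p.stderr.on('data', (data) => {
      console.error(`stderr: ${data}`);
    });
    // Failures to start the process (ETXTBSY, ENOENT, ...) arrive here.
    p.on('error', (err) => {
      resolve({ ok: false, code: err.code });
    });
    // Fires once the process has ended and its stdio streams are closed.
    p.on('close', (exitCode) => {
      resolve({ ok: true, exitCode });
    });
  });
}
```

Calling trySpawn('./main.out', [], 'path/to/dir/of/main.out') with a real directory should hang for a bot that is waiting on stdin, which is the expected result here.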
Ok, this is helpful. It's supposed to just hang, as the agent is waiting for input (match information and state), and it didn't quit with an ETXTBSY error. Small chance the bug may be caused by the cross-spawn package. Basically, this is normally how we open your bot, as shown in the stack trace
Error: spawn ETXTBSY
at ChildProcess.spawn (internal/child_process.js:403:11)
at spawn (child_process.js:580:9)
that just called the same command.
A safe alternative is to use the kaggle-environments match running tool. It's less recommended, but still usable and probably won't break. Information on using that will be released later.
Alright, got it
@Desperationis Can you test our latest dockerized version of the lux-ai-2021 tool? There are instructions here: https://github.com/Lux-AI-Challenge/Lux-Design-2021#cli-docker
It matches the lux-ai-2021 tool 1-1 (also please use the compile.sh tool in the new C++ starter kit and pass in main.out as the bot files)
So copy over the cli.sh file and simply run
sh cli.sh src/main.out src/main.out
and you should be good to go.
@StoneT2000 I tried your docker image and I wasn't able to get the container to read either the executable or the source file. Keep in mind I have very little docker experience. The script worked fine, though it only worked with bash cli.sh and not sh cli.sh on my system; not sure why. You might also want to add the sudo prefix on the command tbh. Lmk what I should do about this. The error is here:
diego@adhoc:~/Desktop/Lux-Design-2021-master/kits/cpp/simple$ sudo bash cli.sh src/main.out src/main.out
-=-=-=-=-=-=-=-=-=-=-=-| [INFO] match_VKyPzeUzVHhO |-=-=-=-=-=-=-=-=-=-=-=-
[INFO] (match_VKyPzeUzVHhO) - Design: lux_ai_2021 | Initializing match - ID: VKyPzeUzVHhO, Name: match_VKyPzeUzVHhO
AgentFileError: src/main.out does not exist, check if file path provided is correct
at AgentFileError.AgentError [as constructor] (/usr/local/nvm/versions/node/v14.16.0/lib/node_modules/@lux-ai/2021-challenge/node_modules/dimensions-ai/lib/main/DimensionError/AgentError/index.js:31:28)
at new AgentFileError (/usr/local/nvm/versions/node/v14.16.0/lib/node_modules/@lux-ai/2021-challenge/node_modules/dimensions-ai/lib/main/DimensionError/AgentError/index.js:124:28)
at new Agent (/usr/local/nvm/versions/node/v14.16.0/lib/node_modules/@lux-ai/2021-challenge/node_modules/dimensions-ai/lib/main/Agent/index.js:176:23)
at /usr/local/nvm/versions/node/v14.16.0/lib/node_modules/@lux-ai/2021-challenge/node_modules/dimensions-ai/lib/main/Agent/index.js:1000:29
at Array.forEach (<anonymous>)
at Function.Agent.generateAgents (/usr/local/nvm/versions/node/v14.16.0/lib/node_modules/@lux-ai/2021-challenge/node_modules/dimensions-ai/lib/main/Agent/index.js:995:19)
at Match.<anonymous> (/usr/local/nvm/versions/node/v14.16.0/lib/node_modules/@lux-ai/2021-challenge/node_modules/dimensions-ai/lib/main/Match/index.js:221:53)
at step (/usr/local/nvm/versions/node/v14.16.0/lib/node_modules/@lux-ai/2021-challenge/node_modules/dimensions-ai/lib/main/Match/index.js:33:23)
at Object.next (/usr/local/nvm/versions/node/v14.16.0/lib/node_modules/@lux-ai/2021-challenge/node_modules/dimensions-ai/lib/main/Match/index.js:14:53)
at fulfilled (/usr/local/nvm/versions/node/v14.16.0/lib/node_modules/@lux-ai/2021-challenge/node_modules/dimensions-ai/lib/main/Match/index.js:5:58) {
agentID: 0
}
Yes you need bash, I'll update the readme later.
So you are saying you still can't run a game when using bash cli.sh?
Yeah, but hey, at least it's not the same error. All I know is that the docker pseudo-vm cannot find the executable or cpp file on my local machine. compile.sh works completely fine with no errors, so I think it's the cli.sh parameters that are causing me issues. No matter what I throw at it, I get the same "no such file or directory" error.
Probably due to the bind mount not working correctly, or just me not using the script correctly. Here, main.out is a legitimate file created by compile.sh.
Chiming in here that I am able to reproduce this issue. Here are some results that I found:
bash cli.sh ./kits/cpp/simple/main.cpp ./kits/cpp/simple/main.cpp
This command reproduces the issue about 50% of the time. pastebin of the output. The other 50% of the time, it runs successfully.
bash cli.sh ./kits/cpp/simple/main.cpp ./kits/python/simple/main.py
This command always works for me (cpp starter + python starter).
bash cli.sh ./kits/cpp/simple/main.out ./kits/cpp/simple/main.out
If main.out exists, this command always works for me (precompiled cpp starter + precompiled cpp starter).
This is a stab in the dark, but could there be an issue with two compilation processes trying to write to the same file? Maybe agent 0 is trying to execute main.out, but agent 1 is trying to write to main.out. I'm not exactly sure how the game engine runs under the hood.
@djkeyes That is very, very peculiar. Personally, I never encountered the ETXTBSY error again once I used the docker container, only the ENOENT error mentioned above, but @StoneT2000 and I were able to resolve that. Maybe check sudo docker ps to see if you're running two instances of the docker image? Keep in mind I only run with two executable files, not cpp files.
I think you're right about the write conflict, though. The command treats the two arguments as separate files and tries to compile a main.out for each, then load it into memory. Because of this, if you pass two .cpp files, the program tries to make two main.out's at the same time (possibly through multithreading). One might finish before the other even begins, and have time to load into memory before the other starts writing a new main.out file. What also might be happening is that the file library used to read each file may not close the file in time, also leading to different writing times. This explains the 50/50 random odds of you being able to compile.
This also explains the py-cpp and out-out combinations. py-cpp only needs to compile a single cpp file, while the out-out combination can read the executables directly.
If this is indeed what is happening, a possible solution is to simply name each compiled executable either 1 or 2 depending on where it is in the parameters. This way, no conflict occurs.
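The naming idea above could look roughly like the sketch below. It is only an illustration under assumptions: outputPathFor and compileArgsFor are hypothetical helpers, the .agentN.out naming scheme is made up, and the real engine may build its compile command differently.

```javascript
const path = require('path');

// Hypothetical helper: derive a per-agent output path so that two agents
// built from the same source file never share one executable.
function outputPathFor(sourceFile, agentIndex) {
  const dir = path.dirname(sourceFile);
  const base = path.basename(sourceFile, path.extname(sourceFile));
  return path.join(dir, `${base}.agent${agentIndex}.out`);
}

// Hypothetical helper: argument list for a g++ invocation for one agent.
// Only the -o flag is used here; any other flags are up to the caller.
function compileArgsFor(sourceFile, agentIndex) {
  return ['-o', outputPathFor(sourceFile, agentIndex), sourceFile];
}
```

With this scheme, passing main.cpp twice would produce main.agent0.out and main.agent1.out, so neither compilation writes to a file the other agent is executing.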
OH you two may be right! I never knew this could be an issue.
I believe you are exactly right! I think we will change the documentation to instead tell users to pass in main.out as the file instead of main.cpp.
Our CLI tool's backing engine has the problem here: https://github.com/StoneT2000/Dimensions/blob/master/src/MatchEngine/index.ts#L107
For each agent we initialize, if the file given is .cpp, we run a default C++ compilation command (in hindsight, not very smart, should let the user decide on how to compile it better, maybe add a option to submit a compile.sh file (barring security problems with that)). However, in the line above, each agent is asynchronously compiled lmao, so two processes are probably trying to write at the same time or something.
I'll leave this issue open in case others encounter this, and I've added (ETXTBSY error) to the title so it's more discoverable, thanks!
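One possible way to avoid the asynchronous double compile described above is to share a single compilation per source file. The sketch below is not the Dimensions implementation: compileOnce is a hypothetical name, and compile stands for whatever async function actually runs the compiler and returns the path of the executable.

```javascript
const path = require('path');

// In-flight or finished compilations, keyed by absolute source path.
const compilations = new Map();

// Run `compile(sourceFile)` at most once per source file, even when several
// agents request the same file at the same time. Every caller receives the
// same promise, so nobody executes the output while it is still being written.
function compileOnce(sourceFile, compile) {
  const key = path.resolve(sourceFile);
  if (!compilations.has(key)) {
    const pending = Promise.resolve().then(() => compile(sourceFile));
    // Forget failed attempts so a later call can try again.
    pending.catch(() => compilations.delete(key));
    compilations.set(key, pending);
  }
  return compilations.get(key);
}
```

Two agents started with the same main.cpp would then both await one compile and spawn the resulting executable only after it has been fully written.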
So I've heard, you can run cpp files directly into lux-ai-2021. However, when I try to run it with my own bot, I get this error. https://pastebin.com/CQDev0Kw error in its entirety.
I get this error when trying to run the v1.1.x branch of the cpp kit as well; not sure what is causing it. It works fine when transpiled into JS. I've tried running the lux-ai-2021 command as sudo and not sudo, though it doesn't make a difference. Here's the exact main.cpp code I tried to run in simple/: