ServiceNow / duorat

DuoRAT is a ServiceNow Research project that was started at Element AI.

Problems loading jsonnet files #9

Open hclent opened 3 years ago

hclent commented 3 years ago

Hey there!

I am trying to run the DuoRAT code from the Docker container, and I am running into issues loading the jsonnet files:

root@duorat:/app# python scripts/train.py --config configs/duorat/duorat-finetune-bert-large.jsonnet --logdir logdir/duorat-bert
Traceback (most recent call last):
  File "scripts/train.py", line 519, in <module>
    main()
  File "scripts/train.py", line 486, in main
    config = json.loads(_jsonnet.evaluate_file(args.config))
RuntimeError: RUNTIME ERROR: couldn't open import "../../data/train.libsonnet": no match locally or in the Jsonnet library paths.
    configs/duorat/duorat-base.libsonnet:5:17-52    object <anonymous>
    configs/duorat/duorat-base.libsonnet:(4:11)-(7:6)   object <anonymous>
    During manifestation    

As a sanity check, I looked in the configs directory, and the files are there:

root@duorat:/app# ls configs/duorat/
duorat-12G.jsonnet     duorat-bert.jsonnet  duorat-finetune-bert-base.jsonnet                  duorat-finetune-bert-large.jsonnet   duorat-good-no-bert.jsonnet             duorat-new-db-content.jsonnet
duorat-base.libsonnet  duorat-dev.jsonnet   duorat-finetune-bert-large-attention-maps.jsonnet  duorat-good-no-bert-no-from.jsonnet  duorat-new-db-content-no-whole.jsonnet

I have of course already googled around to see what the source of this error could be, but I can't seem to find anything helpful enough. Would you advise maybe that I try to hardcode this, or do you have any intuition as to why I might be getting this error?
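(Editor's note: not from the repo, just illustrating the path arithmetic with Python's `posixpath`.) jsonnet resolves a relative `import` against the directory of the importing file, not the current working directory, so the files visible under `configs/duorat/` are not the issue; the import points outside that folder:

```python
import posixpath

# duorat-base.libsonnet lives in /app/configs/duorat/, so its import of
# "../../data/train.libsonnet" resolves relative to that directory:
importer_dir = "/app/configs/duorat"
resolved = posixpath.normpath(
    posixpath.join(importer_dir, "../../data/train.libsonnet")
)
print(resolved)  # /app/data/train.libsonnet
```

In other words, the interpreter is looking for `/app/data/train.libsonnet`, which is a different location from the config folder that `ls` was run on.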

-- Edit with more details

I suspect the problem may have to do with this command, where the host's logdir and data directories are mounted into the container:

nvidia-docker run -it -u $(id -u ${USER}) --name my_duorat --rm -v $PWD/logdir:/logdir -v $PWD/data/:/app/data duorat

Because I am not running the Docker container locally, I did not use this command. Instead, I created an image on Google Cloud Platform and spun it up as an interactive bash session to work with. I tried creating symlinks, and also using mount at the paths where the docker command would have mounted the directories, but that hasn't fixed the problem. For example:

root@duorat:/app# mount $PWD/data/ /app/data
mount: /app/data/: mount point does not exist.

Thanks so much!

tscholak commented 3 years ago

Hi @hclent, thanks for checking out our code!

The problem you describe seems to be related to the data directory being empty. ../../data/train.libsonnet is a jsonnet file that defines which databases and examples are considered training data. We put this information with the Spider data itself rather than in the config folder. Since the data folder is unavailable, the jsonnet interpreter can't load the train.libsonnet file, and the script crashes.

The solution, then, is to make the data available in the /app/data folder. I'm not sure what your Google Cloud setup is, but I think your last mount command fails because the /app/data folder doesn't exist. You should try creating it with mkdir: docker does that for you, but when you use mount you have to do it yourself. Furthermore, since mount dates from a time when the typical workflow was to mount partitions rather than folders, you need to specify the method by which the folder should be mounted. I recommend trying:

$ mount -o bind,ro $PWD/data/ /app/data
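(Editor's note: a minimal sketch of the mkdir-then-mount sequence. It uses a scratch directory from mktemp so it can run without root; in the container the real target would be /app/data.)

```shell
# mount(8) refuses to run when the target directory is missing --
# reproduce that failure mode in a scratch directory:
base=$(mktemp -d)
target="$base/app/data"

[ -d "$target" ] || echo "mount point does not exist"   # the same complaint mount printed

# `docker run -v` creates the target automatically; plain mount does not,
# so it has to be created by hand first:
mkdir -p "$target"
[ -d "$target" ] && echo "mount point ready"

# With the directory in place, the bind mount suggested above
# (run as root inside the container) would succeed:
#   mount -o bind,ro "$PWD/data" /app/data
```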

Let me know if this solves your case :)