Closed by jamestwebber 5 years ago
Thanks, looking. Yes, this stems from a change to aegea batch that I introduced in a recent refactor.
Fixed in v2.6.6, please test. Sorry about the disruption.
Hm, I think a couple of other things changed, which I will need to debug. When I tested a job I got this error:
Traceback (most recent call last):
  File "/usr/local/bin/aegea", line 23, in <module>
    aegea.main()
  File "/usr/local/lib/python3.5/dist-packages/aegea/__init__.py", line 89, in main
    result = parsed_args.entry_point(parsed_args)
  File "/usr/local/lib/python3.5/dist-packages/aegea/ebs.py", line 65, in create
    return attach(parser_attach.parse_args([res["VolumeId"]], namespace=args))
  File "/usr/local/lib/python3.5/dist-packages/aegea/ebs.py", line 139, in attach
    logger.info("Formatting %s (%s)", args.volume_id, find_devnode(args.volume_id))
  File "/usr/local/lib/python3.5/dist-packages/aegea/ebs.py", line 109, in find_devnode
    raise Exception("Could not find devnode for {}".format(volume_id))
Exception: Could not find devnode for vol-073d4a4404fbac5d1
Detaching EBS volume
usage: aegea ebs detach [-h] [--max-col-width MAX_COL_WIDTH] [--json]
                        [--log-level {WARNING,CRITICAL,DEBUG,INFO,ERROR}]
                        [--unmount] [--delete] [--force] [--dry-run]
                        volume_id
aegea ebs detach: error: the following arguments are required: volume_id
The command run was
aegea batch submit --queue aegea_batch --vcpus 16 --memory 64000 --ecr-image aligner --storage /mnt=500 --command 'PATH=$HOME/anaconda/bin:$PATH; cd utilities; git pull; git checkout master; python setup.py install; python -m utilities.alignment.run_star_and_htseq --taxon mm10-plus --num_partitions 100 --partition_id 0 --s3_input_path s3://czb-seqbot/fastqs/190906_A00111_0366_AHNKGFDSXX/ --s3_output_path s3://czb-maca/Plate_seq/parabiosis/190906_A00111_0366_AHNKGFDSXX/mm10/'
Not sure if you have access to our logs but it's job 2281ef07-34d9-4143-8913-45da0152f124
That is a different issue, discussed in #47. The short-term fix is to use m5/r5/c5 family instances. I'm working on a longer term solution.
The issue discussed in #47 should be fixed in v2.6.8 (on all instance types), please test.
It seems that a recent change has broken our previous workflow somehow. We have scripts that build up a command for aegea batch and then use subprocess to submit the job, like so:

Which now raises the error:
This happens with versions v2.6.4 and v2.6.5. I recommended downgrading to a known working version of aegea as a quick fix, and they reported success with v2.3.6. We could try bisecting that space but hopefully you will have a better idea what happened. My guess is that there's some issue with how aegea is building the batch command? Maybe due to the complexity of the command we're submitting?
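To illustrate the submission pattern described above, here is a minimal sketch. It is not the actual script from this report. The helper names build_submit_command and submit are hypothetical, and the only flags used are those that appear in the aegea batch submit command quoted in this thread. It assumes --command takes the whole shell snippet as a single argument, as in that command.

```python
import shlex
import subprocess


def build_submit_command(queue, vcpus, memory, ecr_image, storage, job_command):
    """Build an argv list for `aegea batch submit` (hypothetical helper).

    Only flags that appear in the command quoted in this thread are used.
    """
    return [
        "aegea", "batch", "submit",
        "--queue", queue,
        "--vcpus", str(vcpus),
        "--memory", str(memory),
        "--ecr-image", ecr_image,
        "--storage", storage,
        # The shell snippet is passed as ONE argv element, so it needs no
        # extra quoting when the list is handed to subprocess.
        "--command", job_command,
    ]


def submit(argv, dry_run=True):
    """Print the shell-quoted command; run it only when dry_run is False."""
    print(" ".join(shlex.quote(arg) for arg in argv))
    if dry_run:
        return None
    # check=True raises CalledProcessError if aegea exits non-zero.
    return subprocess.run(argv, check=True)


if __name__ == "__main__":
    argv = build_submit_command(
        queue="aegea_batch",
        vcpus=16,
        memory=64000,
        ecr_image="aligner",
        storage="/mnt=500",
        job_command="cd utilities; git pull; python setup.py install",
    )
    submit(argv)  # dry run: prints the command without calling aegea
```

Passing the arguments as a list rather than a single shell string keeps the quoting of the long --command value out of the picture. That can help separate a quoting problem in the calling script from a change in how aegea itself builds the batch job.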