kbaseattic / assembly

An extensible framework for genome assembly.
MIT License

Fix plugin to deal with SPAdes core dumps #304

Open levinas opened 9 years ago

levinas commented 9 years ago

SPAdes v3.5.0 occasionally dumps core into the directory from which the compute server is launched:

-rwxrwxr-x 1 ubuntu ubuntu       727 Jul 23  2014 start_service.kbase
-rwxrwxr-x 1 ubuntu ubuntu       128 Jul 23  2014 start_service
-rw-rw-r-- 1 ubuntu ubuntu         0 Jan 29 17:50 ?
-rw-rw-r-- 1 ubuntu ubuntu         0 Jan 30 05:43 @SQ?SN:NODE_68_length_2886_cov_2.78947_ID_137?LN:2886
-rw-rw-r-- 1 ubuntu ubuntu         0 Jan 30 05:43 @SQ?SN:NODE_1992_length_104_cov_6.95918_ID_3983?LN:104
-rw------- 1 ubuntu ubuntu 205926400 Mar  7 19:38 core
-rwxrwxr-x 1 ubuntu ubuntu      1255 Mar  9 03:15 start_control_server
-rwxrwxr-x 1 ubuntu ubuntu      1862 Mar  9 03:15 start_compute_server

We may need to set cwd in Popen(). Also, should single files be passed in as --s# rather than --pe#-s?
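
A minimal sketch of the cwd idea, assuming a hypothetical helper `run_in_dir()` (not part of the existing code) and a stand-in command that drops a file into its current directory the way a crashing tool might drop `core`:

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_dir(cmd, workdir):
    """Run cmd with its working directory set to workdir; return the exit code.

    Hypothetical helper for illustration only.
    """
    Path(workdir).mkdir(parents=True, exist_ok=True)
    proc = subprocess.Popen(cmd, cwd=str(workdir))
    return proc.wait()


# Stand-in for a tool that writes a stray file into its current directory.
stray_cmd = [sys.executable, "-c", "open('stray.txt', 'w').close()"]

job_dir = Path(tempfile.mkdtemp(prefix="job_"))
exit_code = run_in_dir(stray_cmd, job_dir)

# The stray file lands in the job directory, not in the launch directory.
stray_in_job_dir = (job_dir / "stray.txt").exists()
```

Note that `cwd` only changes where relative paths resolve for the child process. Whether a core file follows it depends on the system's core dump settings.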

levinas commented 9 years ago

There are two issues here:

  1. SPAdes fails for some small datasets (mostly at the final misassembly correction stage). Solution: fall back to the contigs from before the correction.
  2. SPAdes creates empty files that pollute the server launch directory. This is dealt with by running arast_popen() with cwd set to self.outpath.
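
The fallback in item 1 could look roughly like the sketch below. `pick_contigs()` is a hypothetical helper, and the file names are placeholders, not confirmed SPAdes output names: the plugin would pass whatever paths SPAdes actually writes, in order of preference.

```python
import os
import tempfile


def pick_contigs(candidates):
    """Return the first candidate path that exists and is non-empty, else None."""
    for path in candidates:
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            return path
    return None


# Demo with placeholder names (assumptions, not real SPAdes file names).
outdir = tempfile.mkdtemp()
corrected = os.path.join(outdir, "corrected_contigs.fasta")  # never written: correction failed
uncorrected = os.path.join(outdir, "uncorrected_contigs.fasta")
with open(uncorrected, "w") as handle:
    handle.write(">NODE_1\nACGT\n")

# The corrected file is missing, so the uncorrected contigs are chosen.
chosen = pick_contigs([corrected, uncorrected])
```

Checking for a non-empty file rather than only the exit code also covers the case where the tool exits after leaving a zero-byte output behind.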

The strange thing is that when I globally modified arast_popen() to always call subprocess.Popen(cwd=self.outpath...), it failed on tagdust. Tagdust just generates empty files.
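
Nothing here confirms why tagdust failed. One possible cause, offered only as a guess, is that some commands are built with paths relative to the launch directory, which stop resolving once cwd changes. The sketch below uses a hypothetical `popen_with_cwd()` with an opt-in `cwd` and a stand-in child process to show the effect and one way around it:

```python
import os
import subprocess
import sys
import tempfile


def popen_with_cwd(cmd, cwd=None, path_args=()):
    """Run cmd in cwd and return its exit code.

    Arguments at the indexes listed in path_args are made absolute first,
    so they are resolved against the caller's directory, not the new cwd.
    Hypothetical helper for illustration only.
    """
    cmd = list(cmd)
    for i in path_args:
        cmd[i] = os.path.abspath(cmd[i])
    return subprocess.Popen(cmd, cwd=cwd).wait()


# An input file referenced by a path relative to the current directory.
indir = tempfile.mkdtemp()
input_abs = os.path.join(indir, "reads.txt")
open(input_abs, "w").close()
input_rel = os.path.relpath(input_abs, os.getcwd())

# Stand-in for a tool: exits 0 if it can see its input file, 1 otherwise.
check = "import os, sys; sys.exit(0 if os.path.exists(sys.argv[1]) else 1)"
cmd = [sys.executable, "-c", check, input_rel]

outdir = tempfile.mkdtemp()
broken = popen_with_cwd(cmd, cwd=outdir)                # relative path no longer resolves
fixed = popen_with_cwd(cmd, cwd=outdir, path_args=[3])  # absolute path still resolves
```

Keeping `cwd` as an optional argument means only the modules known to scatter files (SPAdes here) need to opt in, while the others keep their current behaviour.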