abhilekhsingh / gc3pie

Automatically exported from code.google.com/p/gc3pie

gcodeml ignores the `-o` option and puts jobs in the default output path #190

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hello,

Please have a look at the following example for which the renaming of input 
directories does not work.

gcodeml -o test.jobs test.jobs

test.jobs/
-- FAM_1
    |-- FAM_1.1
    |   |-- FAM_1.1.H0.ctl
    |   |-- FAM_1.1.H1.ctl
    |   |-- FAM_1.1.nwk
    |   `-- FAM_1.1.phy
    `-- FAM_1.2
        |-- FAM_1.2.H0.ctl
        |-- FAM_1.2.H1.ctl
        |-- FAM_1.2.nwk
        `-- FAM_1.2.phy

Status of jobs in the 'gcodeml' session: (at 13:44:19, 06/17/11)
       NEW   0/2    (0.0%)  
   STOPPED   0/2    (0.0%)  
TERMINATED   2/2   (100.0%) 
    failed   2/2   (100.0%)

Also find the ginfo log below:

job.43502
    _attached: False
    _grid: <gc3libs.__NoGrid object at 0x2b25adc4f390>
    arguments: FAM_1.2.H0.ctl, FAM_1.2.H1.ctl
    codeml: /Home/akuzniar/
    environment: 
    executable: codeml.pl
    execution: 
        _exitcode: 1
        _signal: 0
        _state: TERMINATED
        ...
    exit_code: 0

But in fact the output files were downloaded to gcodeml.out rather than
to the test.jobs directory:

gcodeml.out/
|-- FAM_1.1
|   |-- FAM_1.1.mlc
|   |-- codeml.stderr.txt
|   `-- codeml.stdout.txt
`-- FAM_1.2
    |-- FAM_1.2.mlc
    |-- codeml.stderr.txt
    `-- codeml.stdout.txt

Moreover, the input directory structure is "flattened" out.

Ciao,
 A.

Original issue reported on code.google.com by arnold.k...@gmail.com on 20 Jun 2011 at 8:57

GoogleCodeExporter commented 9 years ago

Original comment by riccardo.murri@gmail.com on 23 Jun 2011 at 9:50

GoogleCodeExporter commented 9 years ago

Original comment by riccardo.murri@gmail.com on 1 Jul 2011 at 2:21

GoogleCodeExporter commented 9 years ago
Is this still reproducible with the latest version of `gcodeml` from
`trunk`?

If I run it locally, I get this:

    $ ./gcodeml.py -s XXX -o OUTPUT/PATH/NAME -C45 2_hier_fams/
    ...
    Status of jobs in the 'XXX' session: (at 13:10:39, 09/06/11)
            NEW   0/4    (0.0%)  
        RUNNING   0/4    (0.0%)  
        STOPPED   0/4    (0.0%)  
      SUBMITTED   0/4    (0.0%)  
     TERMINATED   4/4   (100.0%) 
    TERMINATING   0/4    (0.0%)  
         failed   3/4   (75.0%)  
             ok   1/4   (25.0%)  
          total   4/4   (100.0%) 

    $  tree OUTPUT/
    OUTPUT/
    └── home
        └── rmurri
            └── gc3
                └── gc3pie
                    └── trunk
                        └── gc3pie
                            └── gc3apps
                                └── codeml
                                    └── 2_hier_fams
                                        ├── FAM_1
                                        │   ├── FAM_1.1
                                        │   │   └── FAM_1.1.out
                                        │   │       ├── codeml.stderr.txt
                                        │   │       ├── codeml.stdout.txt
                                        │   │       ├── FAM_1.1.H0.mlc
                                        │   │       └── FAM_1.1.H1.mlc
                                        │   └── FAM_1.2
                                        │       └── FAM_1.2.out
                                        │           ├── codeml.stderr.txt
                                        │           ├── codeml.stdout.txt
                                        │           ├── FAM_1.2.H0.mlc
                                        │           └── FAM_1.2.H1.mlc
                                        └── FAM_2
                                            ├── FAM_2.1
                                            │   └── FAM_2.1.out
                                            │       ├── codeml.stderr.txt
                                            │       ├── codeml.stdout.txt
                                            │       ├── FAM_2.1.H0.mlc
                                            │       └── FAM_2.1.H1.mlc
                                            └── FAM_2.2
                                                └── FAM_2.2.out
                                                    ├── codeml.stderr.txt
                                                    ├── codeml.stdout.txt
                                                    ├── FAM_2.2.H0.mlc
                                                    └── FAM_2.2.H1.mlc

    19 directories, 16 files

Note that:

0) the default of `gcodeml` is now to put the output directories in
   the same place where the inputs are: so, if you do not specify any
   "-o" option, you get a "FAM_1.1.out" directory *inside* the
   "FAM_1.1" directory.

1) the flattening of the output directory hierarchy is intentional;
   you have to specify the trailing part "PATH/NAME" if you want to
   keep the output directory nesting.

   I agree that PATH/NAME is too verbose (as in the above output),
   because it replicates the entire directory path from the root of
   the filesystem.

2) I'm uncertain how we can overcome this: how do we know how many
   directories should be stripped from the beginning of PATH?

   Imagine you run `gcodeml ../../a ./b -o output/PATH/NAME`: what
   should the `output` directory look like?  Should it contain
   subdirectories `a` and `b` right under it?  Ok, but then what
   happens when you run `gcodeml ../variant1/data ../variant2/data`?
   Then we have two subdirectories which are *both* named `data`...

   In addition, what should happen if you start `gcodeml` in one
   directory, have all jobs submitted, then stop it and restart it in
   a different directory?  Now the relative paths have changed: which
   ones should be used for computing PATH, the old or the new?
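
One possible answer to the questions above, as a minimal sketch and not something `gcodeml` actually does: resolve every input directory to an absolute path and strip only the longest prefix that all of them share. The helper name `strip_common_prefix` and its `start` parameter are hypothetical, and `os.path.commonpath` requires Python 3.5 or later.

```python
import os
import os.path


def strip_common_prefix(input_dirs, start=None):
    """Map each input directory to a shortened output subpath.

    Hypothetical helper, not part of gc3pie.  Each input is resolved
    against `start` (default: the current working directory), then the
    longest directory prefix shared by all inputs is removed.
    """
    if not input_dirs:
        return {}
    start = os.path.abspath(start if start is not None else os.getcwd())
    absolute = [os.path.normpath(os.path.join(start, d)) for d in input_dirs]
    if len(absolute) == 1:
        # A single input shares everything with itself, so keep only
        # its last path component.
        common = os.path.dirname(absolute[0])
    else:
        common = os.path.commonpath(absolute)
    # Whatever is left after the shared prefix would replace PATH.
    return {d: os.path.relpath(a, common)
            for d, a in zip(input_dirs, absolute)}
```

Under this scheme, `../variant1/data` and `../variant2/data` would map to `variant1/data` and `variant2/data`, so the two `data` directories no longer clash. On the other hand, `../../a ./b` started from a hypothetical `/work/x/y` would map to `a` and `x/y/b`, so `b` does not end up right under `output`, and the result still depends on the directory where the command was started. The restart question therefore remains open unless the absolute input paths are stored with the session.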

Original comment by riccardo.murri@gmail.com on 6 Sep 2011 at 1:10