OpenGridForum / drmaa2-python

DRMAAv2 language binding and reference implementation for Python
Apache License 2.0

follow the output of qacct for naming conventions of class members #4

Open mylons opened 9 years ago

mylons commented 9 years ago

Naming that is inconsistent with existing utilities such as qacct is confusing at best. For example, the Job class uses jobId instead of jobnumber. The full qacct output for a job is below:

==============================================================
qname        prod-r2v
hostname     pipeline-1a12-16l.xxx.xxxx.co
group        xgroup
owner        xowner
project      NONE
department   defaultdepartment
jobname      SomeName
jobnumber    647723
taskid       undefined
account      sge
priority     0
qsub_time    Wed Aug  5 20:52:36 2015
start_time   Wed Aug  5 20:52:19 2015
end_time     Wed Aug  5 20:52:45 2015
granted_pe   smp
slots        1
failed       0
exit_status  0
ru_wallclock 26
ru_utime     4.033
ru_stime     3.145
ru_maxrss    99532
ru_ixrss     0
ru_ismrss    0
ru_idrss     0
ru_isrss     0
ru_minflt    129741
ru_majflt    104
ru_nswap     0
ru_inblock   238872
ru_oublock   1328
ru_msgsnd    0
ru_msgrcv    0
ru_nsignals  0
ru_nvcsw     30244
ru_nivcsw    485691
cpu          7.178
mem          3.222
io           0.035
iow          0.000
maxvmem      1.169G
arid         undefined
jc_name      NONE

mylons commented 9 years ago

@troeger: I'm currently implementing the Job class, and instantiating a template from the namedtuple is giving me issues as a result of the above:

from collections import namedtuple

JobTemplate = namedtuple('JobTemplate', ['remote_command', 'args', 'submit_as_hold', 'rerunnable',
                                         'job_environment', 'working_directory', 'job_category',
                                         'email', 'email_on_started', 'email_on_terminated', 'job_name',
                                         'input_path', 'output_path', 'error_path', 'join_files',
                                         'reservation_id', 'queue_name', 'min_slots', 'max_slots',
                                         'priority', 'candidate_machines', 'min_phys_memory', 'machine_os',
                                         'machine_arch', 'start_time', 'deadline_time', 'stage_in_files',
                                         'stage_out_files', 'resource_limits', 'accounting_id'])

This doesn't map to the output of Univa's qacct and requires special handling for every single member.
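
To make the mismatch concrete, here is a minimal sketch that parses one qacct record into a plain dict keyed by the qacct field names. The helper name parse_qacct is hypothetical and not part of drmaa2-python; the sketch assumes each line is a key followed by whitespace and a value, as in the output above.

```python
def parse_qacct(text):
    """Parse a single qacct record into a dict of raw string values."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the ===== separator that starts a record.
        if not line or line.startswith("="):
            continue
        # The first whitespace-delimited token is the key, the rest is the value.
        parts = line.split(None, 1)
        record[parts[0]] = parts[1] if len(parts) > 1 else ""
    return record


sample = """\
==============================================================
qname        prod-r2v
jobname      SomeName
jobnumber    647723
qsub_time    Wed Aug  5 20:52:36 2015
exit_status  0
"""

record = parse_qacct(sample)
print(record["jobnumber"])
```

Keys such as jobnumber and qname then have to be renamed before they can be matched against standardized, job_id-style attribute names.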

troeger commented 9 years ago

Yes, you have to translate attributes internally to the names used by the specific DRM system. That is the price of implementing a standardized interface. I would propose that you support both: the standardized attribute names and the implementation-specific names. @dgruber definitely also has an opinion on that.
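
One way to support both sets of names could look like the sketch below. It is only an illustration: the QACCT_TO_DRMAA2 table and the NativeJobInfo class are hypothetical, and the pairing of qacct fields with snake_case versions of the DRMAA2 JobInfo attributes is an assumption, not something taken from drmaa2-python.

```python
# Assumed pairing of qacct field names with snake_case DRMAA2 JobInfo names.
QACCT_TO_DRMAA2 = {
    "jobnumber": "job_id",
    "owner": "job_owner",
    "qname": "queue_name",
    "exit_status": "exit_status",
    "slots": "slots",
    "qsub_time": "submission_time",
    "start_time": "dispatch_time",
    "end_time": "finish_time",
    "ru_wallclock": "wallclock_time",
    "cpu": "cpu_time",
}


class NativeJobInfo:
    """Expose a qacct record under both standardized and native names."""

    def __init__(self, native):
        self._native = dict(native)
        # Build the standardized view once, from the fields that have a mapping.
        self._standard = {
            QACCT_TO_DRMAA2[key]: value
            for key, value in self._native.items()
            if key in QACCT_TO_DRMAA2
        }

    def __getattr__(self, name):
        # Only called when normal lookup fails; refuse private names
        # so that a missing _native/_standard cannot recurse.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._standard:
            return self._standard[name]
        if name in self._native:
            return self._native[name]
        raise AttributeError(name)


info = NativeJobInfo({"jobnumber": "647723", "jobname": "SomeName", "qname": "prod-r2v"})
print(info.job_id, info.jobnumber, info.queue_name)
```

The standardized name is looked up first, so a portable caller sees the same attributes on every DRM system, while Grid Engine users can still reach fields by the names qacct prints.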

dgruber commented 9 years ago

I would certainly give preference to the naming schemes (camel case, upper case, underscores, etc.) that are common in the implementation language. As @troeger said, the actual naming was part of the standardization process. Even the DRMAA2 C language binding shipped with Univa Grid Engine uses the standardized names rather than the Grid Engine names, on purpose.

This is the Go and JSON version (https://github.com/dgruber/drmaa2/blob/master/drmaa2.go):

type JobTemplate struct {
    Extension         `xml:"-" json:"-"`
    RemoteCommand     string            `json:"remoteCommand"`
    Args              []string          `json:"args"`
    SubmitAsHold      bool              `json:"submitAsHold"`
    ReRunnable        bool              `json:"reRunnable"`
    JobEnvironment    map[string]string `json:"jobEnvironment"`
    WorkingDirectory  string            `json:"workingDirectory"`
    JobCategory       string            `json:"jobCategory"`
    Email             []string          `json:"email"`
    EmailOnStarted    bool              `json:"emailOnStarted"`
    EmailOnTerminated bool              `json:"emailOnTerminated"`
    JobName           string            `json:"jobName"`
    InputPath         string            `json:"inputPath"`
    OutputPath        string            `json:"outputPath"`
    ErrorPath         string            `json:"errorPath"`
    JoinFiles         bool              `json:"joinFiles"`
    ReservationId     string            `json:"reservationId"`
    QueueName         string            `json:"queueName"`
    MinSlots          int64             `json:"minSlots"`
    MaxSlots          int64             `json:"maxSlots"`
    Priority          int64             `json:"priority"`
    CandidateMachines []string          `json:"candidateMachines"`
    MinPhysMemory     int64             `json:"minPhysMemory"`
    MachineOs         string            `json:"machineOs"`
    MachineArch       string            `json:"machineArch"`
    StartTime         time.Time         `json:"startTime"`
    DeadlineTime      time.Time         `json:"deadlineTime"`
    StageInFiles      map[string]string `json:"stageInFiles"`
    StageOutFiles     map[string]string `json:"stageOutFiles"`
    ResourceLimits    map[string]string `json:"resourceLimits"`
    AccountingId      string            `json:"accountingString"`
}

This is the ANSI C counterpart:

typedef struct {
   drmaa2_string      remoteCommand;
   drmaa2_string_list args;
   drmaa2_bool        submitAsHold;
   drmaa2_bool        rerunnable;
   drmaa2_dict        jobEnvironment;
   drmaa2_string      workingDirectory;
   drmaa2_string      jobCategory;
   drmaa2_string_list email;
   drmaa2_bool        emailOnStarted;
   drmaa2_bool        emailOnTerminated;
   drmaa2_string      jobName;
   drmaa2_string      inputPath;
   drmaa2_string      outputPath;
   drmaa2_string      errorPath;
   drmaa2_bool        joinFiles;
   drmaa2_string      reservationId;
   drmaa2_string      queueName;
   long long          minSlots;
   long long          maxSlots;
   long long          priority;
   drmaa2_string_list candidateMachines;
   long long          minPhysMemory;
   drmaa2_os          machineOS;
   drmaa2_cpu         machineArch;
   time_t             startTime;
   time_t             deadlineTime;
   drmaa2_dict        stageInFiles;
   drmaa2_dict        stageOutFiles;
   drmaa2_dict        resourceLimits;
   drmaa2_string      accountingId;
   void*              implementationSpecific;
} drmaa2_jtemplate_s;
typedef drmaa2_jtemplate_s * drmaa2_jtemplate;

Hope that helps...

Cheers
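
For the Python binding, the camelCase member names of the C struct above can be turned into the snake_case field names of the namedtuple mechanically. The following is a small sketch of that conversion; camel_to_snake is a hypothetical helper, not a function of drmaa2-python.

```python
import re


def camel_to_snake(name):
    """Convert a camelCase DRMAA2 member name to Python snake_case."""
    # Insert "_" at each boundary between a lowercase letter or digit
    # and an uppercase letter, then lowercase the whole name.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


c_members = ["remoteCommand", "submitAsHold", "rerunnable", "minPhysMemory", "machineOS"]
python_fields = [camel_to_snake(member) for member in c_members]
print(python_fields)
```

Runs of capitals such as the OS in machineOS are kept together, so the result matches the machine_os field used in the Python JobTemplate.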