Open levinas opened 9 years ago
To me, it looks like the operator should be 'cmp' instead of ‘<=>’ since the sequence field of the contigobject is a string and not a number.
I’m not sure the impact of changing the operator. The _build_sequence_object() method is building either a ContigSet or ProteinSet object from a user provided fasta file of DNA sequences or protein sequences. The MD5 of all of the sequences is used to generate an ID for the new object using the register_ids() method from the ID server. It does not seem like that should be an issue since the object IDs are just a number.
The MD5 is also saved in the object — maybe that is used by the workspace service’s versioning support to understand when an object has changed and a new version needs to be created? I suppose there is a chance that a MD5 generated with the updated sort order could match a MD5 generated with the current sort order and cause the same ID to be returned and the workspace service to create a new version of an existing object instead of a new object.
I’m not seeing any impacts from the array of sequences being in a different order but maybe I’m missing something subtle.
Also, the cmp operator is used when doing the same processing on line 816 for building a ContigSet for a SEED genome and line 1000 for RAST genome.
Mike Mundy
From: Fangfang Xia notifications@github.com<mailto:notifications@github.com> Reply-To: kbase/KBaseFBAModeling reply@reply.github.com<mailto:reply@reply.github.com> Date: Friday, November 21, 2014 at 5:10 PM To: kbase/KBaseFBAModeling KBaseFBAModeling@noreply.github.com<mailto:KBaseFBAModeling@noreply.github.com> Subject: [KBaseFBAModeling] string comparison in build_sequence_object (#21)
Is this line trying to sort sequences before computing an MD5? Should the operator be 'cmp' ?
https://github.com/kbase/KBaseFBAModeling/blob/master/lib/Bio/KBase/fbaModelServices/Impl.pm#L2457
— Reply to this email directly or view it on GitHubhttps://github.com/kbase/KBaseFBAModeling/issues/21.
I agree. I think this should be "cmp" and changing this is unlikely to break anything.
Is this line trying to sort sequences before computing an MD5? Should the operator be 'cmp' ?
https://github.com/kbase/KBaseFBAModeling/blob/master/lib/Bio/KBase/fbaModelServices/Impl.pm#L2457