Open rcannood opened 1 year ago
Initial implementation, reusing existing arguments:
case class H5adFileArgument(..., slots: H5adSlots) extends AbstractFileArgument
case class FileArgument(...) extends AbstractFileArgument
case class IntegerArgument(...) extends BaseArgument[Int]
abstract class AbstractFileArgument extends Argument[Path]
abstract class BaseArgument[Type] extends Argument[Type]
// None of these arguments should have examples, defaults, directions, multiple, multiple_sep defined
case class H5adSlots(
  X: Option[BaseArgument[_]],
  layers: List[BaseArgument[_]],
  obs: List[BaseArgument[_]],
  obsp: List[BaseArgument[_]],
  obsm: Map[String, BaseArgument[_]],
  `var`: List[BaseArgument[_]], // `var` is a reserved word in Scala, hence the backticks
  varp: List[BaseArgument[_]],
  varm: Map[String, BaseArgument[_]],
  uns: List[BaseArgument[_]]
)
This doesn't work, because you need to be able to add lists and data frames to uns, and data frames to obsm and varm. BaseArgument only covers scalar types, so it cannot describe those nested structures.
--
WIP:
// new arguments
abstract class AbstractFileArgument extends Argument[Path]
case class H5adFileArgument(..., slots: H5adSlots) extends AbstractFileArgument
case class FileArgument(...) extends AbstractFileArgument
// helper classes for h5adslots
abstract class H5adValue {
  val `type`: String
  val name: String
  val description: Option[String]
  val required: Boolean // default: true
}
case class H5adIntegerValue(...) extends H5adValue { ... }
case class H5adDoubleValue(...) extends H5adValue { ... }
case class H5adLongValue(...) extends H5adValue { ... }
case class H5adStringValue(...) extends H5adValue { ... }
case class H5adBooleanValue(...) extends H5adValue { ... }
case class H5adDictValue(..., values: List[H5adValue]) extends H5adValue { ... }
case class H5adSlots(
  X: Option[H5adValue],
  layers: List[H5adValue],
  obs: List[H5adValue],
  obsp: List[H5adValue],
  obsm: Map[String, H5adValue],
  `var`: List[H5adValue], // `var` is a reserved word in Scala, hence the backticks
  varp: List[H5adValue],
  varm: Map[String, H5adValue],
  uns: List[H5adValue]
)
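To make the WIP outline above a bit more concrete, here is a minimal compilable sketch. It is only one way this could be spelled out, not the actual Viash implementation: the "double" and "dict" type strings, the default values, and the nested "pca" example entry are assumptions made for illustration.

```scala
object H5adSlotsSketch {
  // Common shape of every slot entry, following the WIP outline.
  sealed trait H5adValue {
    def `type`: String
    def name: String
    def description: Option[String]
    def required: Boolean
  }

  final case class H5adDoubleValue(
    name: String,
    description: Option[String] = None,
    required: Boolean = true
  ) extends H5adValue {
    val `type`: String = "double" // assumed type label
  }

  // A dict nests further values, which is what .uns needs.
  final case class H5adDictValue(
    name: String,
    values: List[H5adValue],
    description: Option[String] = None,
    required: Boolean = true
  ) extends H5adValue {
    val `type`: String = "dict" // assumed type label
  }

  final case class H5adSlots(
    X: Option[H5adValue] = None,
    layers: List[H5adValue] = Nil,
    obs: List[H5adValue] = Nil,
    obsp: List[H5adValue] = Nil,
    obsm: Map[String, H5adValue] = Map.empty,
    `var`: List[H5adValue] = Nil,
    varp: List[H5adValue] = Nil,
    varm: Map[String, H5adValue] = Map.empty,
    uns: List[H5adValue] = Nil
  )

  // Hypothetical example: a PCA embedding in .obsm plus a nested entry in .uns.
  val pcaSlots: H5adSlots = H5adSlots(
    obsm = Map("X_pca" -> H5adDoubleValue("X_pca", Some("The resulting PCA embedding."))),
    uns = List(
      H5adDictValue("pca", List(H5adDoubleValue("variance"), H5adDoubleValue("variance_ratio")))
    )
  )
}
```

The nested H5adDictValue is what the first attempt with BaseArgument could not express.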
I'm starting to have a lot of components which have arguments like this:
arguments:
  - name: "--output"
    type: file
    direction: output
    description: The output h5ad file.
    example: output.h5ad
    info:
      slots:
        obsm:
          - type: double
            name: X_pca
            description: The resulting PCA embedding.
            required: true
        varm:
          - type: double
            name: pca_loadings
            description: The PCA loadings matrix.
            required: true
        uns:
          - type: double
            name: pca_variance
            description: The PCA variance objects.
            required: true
  - name: "--obsm_embedding"
    type: string
    default: "X_pca"
    description: "In which .obsm slot to store the resulting PCA embedding."
  - name: "--varm_loadings"
    type: string
    default: "pca_loadings"
    description: "In which .varm slot to store the PCA loadings matrix."
  - name: "--uns_variance"
    type: string
    default: "pca_variance"
    description: "In which .uns slot to store the PCA variance objects."
The slots are used mostly for documentation and to automate gatekeeper components, but it's becoming a hassle to manually type all of these slot arguments. A type: h5ad_file argument could help reduce some of the boilerplate, that is:
arguments:
  - name: "--output"
    type: file
    direction: output
    description: The output h5ad file.
    example: output.h5ad
    slots:
      obsm:
        - type: double
          name: --output_embedding
          description: The resulting PCA embedding.
          required: true
          default: X_pca
      varm:
        - type: double
          name: --output_loadings
          description: The PCA loadings matrix.
          required: true
          default: pca_loadings
      uns:
        - type: double
          name: --output_variance
          description: The PCA variance objects.
          required: true
          default: pca_variance
The point being that by specifying the slots for an --output argument, Viash automatically adds the following arguments: --output_embedding, --output_loadings, --output_variance.
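As a rough sketch of what that expansion could look like (this is not existing Viash code): SlotSpec, DerivedStringArgument and expand are hypothetical names, and the wording of the generated description is an assumption.

```scala
object SlotExpansionSketch {
  // One slot entry as written in the proposed config: `name` is the flag to
  // generate, `default` is the slot key used when the flag is not set.
  final case class SlotSpec(
    `type`: String,
    name: String,
    description: String,
    required: Boolean,
    default: String
  )

  // Hypothetical stand-in for the string argument Viash would generate.
  final case class DerivedStringArgument(name: String, default: String, description: String)

  // Turn every slot entry into one string argument. A Seq of pairs (rather
  // than a Map) keeps the order in which the slots were declared.
  def expand(slots: Seq[(String, List[SlotSpec])]): List[DerivedStringArgument] =
    slots.toList.flatMap { case (container, specs) =>
      specs.map { spec =>
        DerivedStringArgument(
          name = spec.name,
          default = spec.default,
          description = s"In which .$container slot to store: ${spec.description}"
        )
      }
    }

  // The slots of the --output argument from the example config.
  val outputSlots: Seq[(String, List[SlotSpec])] = Seq(
    "obsm" -> List(SlotSpec("double", "--output_embedding", "The resulting PCA embedding.", required = true, default = "X_pca")),
    "varm" -> List(SlotSpec("double", "--output_loadings", "The PCA loadings matrix.", required = true, default = "pca_loadings")),
    "uns" -> List(SlotSpec("double", "--output_variance", "The PCA variance objects.", required = true, default = "pca_variance"))
  )

  val derived: List[DerivedStringArgument] = expand(outputSlots)
}
```

Here derived holds one string argument per slot, in declaration order, so the three extra arguments no longer have to be typed by hand.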
Comment by @tverbeiren:
I think that makes sense, I'm just a bit worried about the possible confusion: slots versus arguments. They look very similar now.
A suggestion: make the mapping to an argument more explicit (albeit optional):
slots:
  obsm:
    description: The resulting PCA embedding.
    type: double
    maps_to_argument:
      name: --output_embedding
      required: true
      default: X_pca
  ...
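On the Scala side, the optional mapping might be modelled roughly as below. This is only a sketch of the suggestion: ArgumentMapping, SlotEntry and mappedArgumentNames are hypothetical names, and the unmapped "uns" entry is added purely to show the optional case.

```scala
object ExplicitMappingSketch {
  // The optional link from a slot to a command-line argument.
  final case class ArgumentMapping(
    name: String,
    required: Boolean = true,
    default: Option[String] = None
  )

  // A slot is documentation first; the argument mapping is opt-in.
  final case class SlotEntry(
    description: String,
    `type`: String,
    mapsToArgument: Option[ArgumentMapping] = None
  )

  // Only slots that carry an explicit mapping yield an argument name.
  def mappedArgumentNames(entries: Map[String, SlotEntry]): List[String] =
    entries.values.toList.flatMap(_.mapsToArgument.toList).map(_.name).sorted

  val slots: Map[String, SlotEntry] = Map(
    "obsm" -> SlotEntry(
      description = "The resulting PCA embedding.",
      `type` = "double",
      mapsToArgument = Some(ArgumentMapping("--output_embedding", required = true, default = Some("X_pca")))
    ),
    // Documented only, no argument generated (hypothetical entry).
    "uns" -> SlotEntry(description = "The PCA variance objects.", `type` = "double")
  )
}
```

Keeping the mapping in its own optional field is what separates "this slot exists" from "this slot is configurable through an argument", which addresses the slots-versus-arguments confusion.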
The recurring arguments could be handled using includes as well, no? Would that make sense or not?
Extra metadata that describes the interface of an h5ad or h5mu file can be used later on to check whether files adhere to that interface.
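A gatekeeper-style check along those lines could be sketched as follows. Reading the actual file is left out: the present map stands in for whatever keys a reader would find, and DeclaredSlot and missingRequired are hypothetical names.

```scala
object InterfaceCheckSketch {
  // A slot the config declares, e.g. container "obsm" with key "X_pca".
  final case class DeclaredSlot(container: String, key: String, required: Boolean = true)

  // `present` maps a container such as "obsm" to the keys found in a file.
  // Returns the required slots that the file does not provide.
  def missingRequired(
    declared: List[DeclaredSlot],
    present: Map[String, Set[String]]
  ): List[DeclaredSlot] =
    declared.filter { slot =>
      slot.required && !present.getOrElse(slot.container, Set.empty[String]).contains(slot.key)
    }

  // The three PCA slots from the example config.
  val declared: List[DeclaredSlot] = List(
    DeclaredSlot("obsm", "X_pca"),
    DeclaredSlot("varm", "pca_loadings"),
    DeclaredSlot("uns", "pca_variance")
  )

  // Hypothetical file contents: .uns lacks the variance entry.
  val present: Map[String, Set[String]] = Map(
    "obsm" -> Set("X_pca"),
    "varm" -> Set("pca_loadings"),
    "uns" -> Set.empty[String]
  )

  val missing: List[DeclaredSlot] = missingRequired(declared, present)
}
```

With the sample data, missing contains only the uns / pca_variance entry, which is the kind of report a gatekeeper component could raise.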
Right now a Viash config doesn't allow for specifying the schema of an h5ad file:
Example of proposed functionality:
I suggest these fields don't have a functional impact at the moment.