jamesdolezal / slideflow

Deep learning library for digital pathology, with both Tensorflow and PyTorch support.
https://slideflow.dev
GNU General Public License v3.0

Concatenate error with multimodal input #282

Closed · jinnyjuice closed this issue 1 year ago

jinnyjuice commented 1 year ago

Description

Hello,

I have a couple of clinical variables (categorical and float) that I want to use as additional input, listed as multi_input = ["age", "sex", "height", "weight"]. When I try to train a (very basic) model, I receive an error saying that the tensor shapes don't match:

ConcatOp : Ranks of all input tensors should match: shape[0] = [16,4,1] vs. shape[1] = [16,2048] [[{{node model/input_merge/concat}}]] [Op:__inference_train_function_35947]

I believe there's something wrong with the way the clinical variables are introduced into the other layers (slide_feature_input). How can I change it? Or am I missing something trivial? It would be great if anybody could help! Cheers.

To Reproduce

Steps to reproduce the behavior:

1. Commands

import slideflow as sf
P = sf.load_project('project')
hp = sf.ModelParams(
    tile_px=299,
    tile_um=100,
)
multi_input = ["age", "sex", "height", "weight"]
P.train(
    'category',
    params=hp,
    val_strategy='none',
    input_header=multi_input,
)

2. Output

[11:45:54] INFO Training model category-HP0...
INFO Hyperparameters: {
"augment": "xyrj",
"batch_size": 16,
"drop_images": false,
"dropout": 0,
"early_stop": false,
"early_stop_method": "loss",
"early_stop_patience": 0,
"epochs": [
3
],
"hidden_layer_width": 500,
"hidden_layers": 0,
"include_top": true,
"l1": 0.0,
"l1_dense": 0.0,
"l2": 0.0,
"l2_dense": 0.0,
"learning_rate": 0.0001,
"learning_rate_decay": 0,
"learning_rate_decay_steps": 100000,
"loss": "sparse_categorical_crossentropy",
"manual_early_stop_batch": null,
"manual_early_stop_epoch": null,
"model": "xception",
"normalizer": null,
"normalizer_source": null,
"optimizer": "Adam",
"pooling": "max",
"tile_px": 299,
"tile_um": 100,
"toplayer_epochs": 0,
"trainable_layers": 0,
"training_balance": "category",
"uq": false,
"validation_balance": "none"
}
INFO Val settings: {
"strategy": "none",
"k_fold": 3,
"k": null,
"k_fold_header": null,
"fraction": null,
"source": null,
"annotations": null,
"filters": null,
"dataset": null
}
INFO Using 687 training TFRecords, 0 validation
INFO Adding input variable age as float
INFO Adding input variable sex as float
INFO Adding input variable height as float
INFO Adding input variable weight as float
[11:46:28] INFO Training with both images and 4 categories of slide-level input
2023-05-21 11:46:28.822013: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30948 MB memory: -> device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:61:00.0, compute capability: 7.0
[11:46:29] INFO Using pretraining from imagenet

Model: "model"

Layer (type)                      Output Shape           Param #    Connected to

tile_image (InputLayer)           [(None, 299, 299, 3)]        0    []

xception (Functional)             (None, 2048)           20861480   ['tile_image[0][0]']

slide_feature_input (InputLayer)  [(None, 4)]                   0    []

post_convolution (Activation)     (None, 2048)                  0    ['xception[0][0]']

input_merge (Concatenate)         (None, 2052)                  0    ['slide_feature_input[0][0]', 'post_convolution[0][0]']

logits-0 (Dense)                  (None, 2)                  4106    ['input_merge[0][0]']

out-0 (Activation)                (None, 2)                     0    ['logits-0[0][0]']

Total params: 20,865,586
Trainable params: 20,811,058
Non-trainable params: 54,528

[11:46:37] INFO Beginning training
Epoch 1/3
2023-05-21 11:46:51.242360: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:428] Loaded cuDNN version 8201
Traceback (most recent call last):
File "", line 1, in
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/slideflow/project.py", line 3378, in train
self._train_hp(
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/slideflow/project.py", line 709, in _train_hp
self._train_split(dataset, hp, val_settings, s_args)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/slideflow/project.py", line 933, in _train_split
project_utils._train_worker(
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/slideflow/project_utils.py", line 147, in _train_worker
results = trainer.train(train_dts, val_dts, **training_kw)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/slideflow/model/tensorflow.py", line 1925, in train
self.model.fit(
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 52, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node 'model/input_merge/concat' defined at (most recent call last):
File "<stdin>", line 1, in <module>
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/slideflow/project.py", line 3378, in train
self._train_hp(
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/slideflow/project.py", line 709, in _train_hp
self._train_split(dataset, hp, val_settings, s_args)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/slideflow/project.py", line 933, in _train_split
project_utils._train_worker(
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/slideflow/project_utils.py", line 147, in _train_worker
results = trainer.train(train_dts, val_dts, **training_kw)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/slideflow/model/tensorflow.py", line 1925, in train
self.model.fit(
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
return fn(*args, **kwargs)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/engine/training.py", line 1650, in fit
tmp_logs = self.train_function(iterator)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/engine/training.py", line 1249, in train_function
return step_function(self, iterator)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/engine/training.py", line 1233, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/engine/training.py", line 1222, in run_step
outputs = model.train_step(data)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/engine/training.py", line 1023, in train_step
y_pred = self(x, training=True)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
return fn(*args, **kwargs)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/engine/training.py", line 561, in __call__
return super().__call__(*args, **kwargs)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
return fn(*args, **kwargs)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/engine/base_layer.py", line 1132, in __call__
outputs = call_fn(inputs, *args, **kwargs)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 96, in error_handler
return fn(*args, **kwargs)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/engine/functional.py", line 511, in call
return self._run_internal_graph(inputs, training=training, mask=mask)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/engine/functional.py", line 668, in _run_internal_graph
outputs = node.layer(*args, **kwargs)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
return fn(*args, **kwargs)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/engine/base_layer.py", line 1132, in __call__
outputs = call_fn(inputs, *args, **kwargs)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 96, in error_handler
return fn(*args, **kwargs)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/layers/merging/base_merge.py", line 196, in call
return self._merge_function(inputs)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/layers/merging/concatenate.py", line 134, in _merge_function
return backend.concatenate(inputs, axis=self.axis)
File "/data/jjung23/miniconda3/envs/sf_2/lib/python3.9/site-packages/keras/backend.py", line 3572, in concatenate
return tf.concat([to_dense(x) for x in tensors], axis)
Node: 'model/input_merge/concat'
ConcatOp : Ranks of all input tensors should match: shape[0] = [16,4,1] vs. shape[1] = [16,2048]
[[{{node model/input_merge/concat}}]] [Op:__inference_train_function_35947]

Expected behavior

Successful training of a multimodal model.

Environment:

Additional context

jamesdolezal commented 1 year ago

Thanks for raising this issue - we'll build a test dataset over the next day or so and work on reproducing the error, so we can find the source of the problem.

In the meantime, do you see the same error when you train with only a single additional clinical variable? Try training 4 different models, one with each clinical variable as a single additional input, to see if the problem can be isolated to one of the variables.
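For instance, a rough sketch reusing the project and hyperparameters from your reproduction script:

import slideflow as sf

P = sf.load_project('project')
hp = sf.ModelParams(tile_px=299, tile_um=100)

# Train one model per clinical variable to isolate which input triggers the error
for var in ["age", "sex", "height", "weight"]:
    P.train(
        'category',
        params=hp,
        val_strategy='none',
        input_header=[var],
    )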

jinnyjuice commented 1 year ago

Thanks for replying! I tried to feed the model only one clinical variable, for example "age". What I get is:

/data/jjung23/miniconda3/envs/sf_tensfl/lib/python3.10/site-packages/keras/engine/functional.py:638: UserWarning: Input dict contained keys ['slide_feature_input'] which did not match any model input. They will be ignored by the model.

After that, it looks like it trains normally on the slides, but without the clinical variable.

jamesdolezal commented 1 year ago

Quick update - I was able to reproduce the problem when using continuous input variables (like the ones you're using here). Categorical slide-level inputs (either single or multiple) are working as expected, but there is an issue with continuous variables. Our automated tests only covered categorical clinical variables for these multi-input models, which is why this wasn't caught by our testing protocol.

I should have a patch out shortly that fixes the problem, and I'll expand our testing to include continuous input variables, as well.

jamesdolezal commented 1 year ago

Ok - patch has been applied for the Tensorflow backend. If you have the ability to run from source, let me know if it works on your end, as well. Still working on a fix for the PyTorch backend.

If this resolves the issue, I'll incorporate it into the next patch release.

jamesdolezal commented 1 year ago

Patch has been released as version 2.0.5.

jinnyjuice commented 1 year ago

Hi, sorry for the late reply. Thank you so much for the patch and the messages! However, I receive the following error:

[22:07:38] INFO     Beginning training
Epoch 1/3                                                                                                                                                                                 
Traceback (most recent call last):                                                                                                                                                        
  File "/data/jjung23/23_04_30/3_train_1.py", line 29, in <module>                                                                                                                        
    P.train(                                                                                                                                                                              
  File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/slideflow/project.py", line 3426, in train                                                                
    self._train_hp(                                                                                                                                                                       
  File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/slideflow/project.py", line 713, in _train_hp                                                             
    self._train_split(dataset, hp, val_settings, s_args)                                                                                                                                  
  File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/slideflow/project.py", line 937, in _train_split                                                          
    project_utils._train_worker(                                                                                                                                                          
  File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/slideflow/project_utils.py", line 147, in _train_worker                                                   
    results = trainer.train(train_dts, val_dts, **training_kw)                                                                                                                            
  File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/slideflow/model/tensorflow.py", line 1924, in train                                                       
    self.model.fit(                                                                                                                                                                       
  File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler                                                
    raise e.with_traceback(filtered_tb) from None                                                                                                                                         
  File "/tmp/__autograph_generated_filevedfejsj.py", line 15, in tf__train_function                                                                                                       
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)                                                                               
  File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/tensorflow/core/function/trace_type/trace_type_builder.py", line 129, in from_value                       
    return default_types.Tuple(*(from_value(c, context) for c in value))                                                                                                                  
  File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/tensorflow/core/function/trace_type/trace_type_builder.py", line 129, in <genexpr>                        
    return default_types.Tuple(*(from_value(c, context) for c in value))                                                                                                                  
  File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/tensorflow/core/function/trace_type/trace_type_builder.py", line 129, in from_value                       
    return default_types.Tuple(*(from_value(c, context) for c in value))                                                                                                                  
  File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/tensorflow/core/function/trace_type/trace_type_builder.py", line 129, in <genexpr>                        
    return default_types.Tuple(*(from_value(c, context) for c in value))                                                                                                                  
  File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/tensorflow/core/function/trace_type/trace_type_builder.py", line 152, in from_value                       
    raise TypeError(                                                                                                                                                                      
TypeError: in user code:                                                                                                                                                                  

    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/engine/training.py", line 1249, in train_function  *                                              
        return step_function(self, iterator)
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/engine/training.py", line 1233, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/engine/training.py", line 1222, in run_step  **
        outputs = model.train_step(data)
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/engine/training.py", line 1027, in train_step
        self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 527, in minimize
        self.apply_gradients(grads_and_vars)
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/mixed_precision/loss_scale_optimizer.py", line 1331, in apply_gradients
        tf.__internal__.smart_cond.smart_cond( 
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/mixed_precision/loss_scale_optimizer.py", line 1329, in apply_fn
        return self._apply_gradients(grads, wrapped_vars)
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/mixed_precision/loss_scale_optimizer.py", line 1361, in _apply_gradients
        self._optimizer.apply_gradients(
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1140, in apply_gradients
        return super().apply_gradients(grads_and_vars, name=name)
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 634, in apply_gradients
        iteration = self._internal_apply_gradients(grads_and_vars)
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1166, in _internal_apply_gradients
        return tf.__internal__.distribute.interim.maybe_merge_call(
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1216, in _distributed_apply_gradients_fn
        distribution.extended.update(
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1211, in apply_grad_to_update_var
        return self._update_step_xla(grad, var, id(self._var_key(var)))
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/tensorflow/core/function/trace_type/trace_type_builder.py", line 129, in from_value
        return default_types.Tuple(*(from_value(c, context) for c in value))
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/tensorflow/core/function/trace_type/trace_type_builder.py", line 129, in <genexpr>
        return default_types.Tuple(*(from_value(c, context) for c in value))
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/tensorflow/core/function/trace_type/trace_type_builder.py", line 129, in from_value
        return default_types.Tuple(*(from_value(c, context) for c in value))
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/tensorflow/core/function/trace_type/trace_type_builder.py", line 129, in <genexpr>
        return default_types.Tuple(*(from_value(c, context) for c in value))
    File "/data/jjung23/miniconda3/envs/sf_tensorflow/lib/python3.9/site-packages/tensorflow/core/function/trace_type/trace_type_builder.py", line 152, in from_value
        raise TypeError(

    TypeError: Python object could not be represented through the generic tracing type. Consider implementing the Tracing Protocol for it: <AutoCastVariable 'block1_conv1/kernel:0' shape=(3, 3, 3, 32) dtype=float32 dtype_to_cast_to=float32>

I couldn't really find anything when I googled the error. Maybe (or hopefully) it's something easy to fix?

jamesdolezal commented 1 year ago

Hmmm - let me investigate. This looks like a separate issue. Can you paste the contents of the model params.json here?

jinnyjuice commented 1 year ago

Yes of course:

{
 "slideflow_version": "2.0.5",
 "project": "MyProject",
 "backend": "tensorflow",
 "git_commit": "ae6ad0e8937207efe60d23a400e88bf12f5db719",
 "model_name": "category-HP0",
 "full_model_name": "category-HP0",
 "stage": "training",
 "img_format": "jpeg",
 "tile_px": 299,
 "tile_um": 100,
 "max_tiles": 0,
 "min_tiles": 0,
 "model_type": "categorical",
 "outcomes": [
  "category"
 ],
 "input_features": [
  "age"
 ],
 "input_feature_sizes": [
  1
 ],
 "input_feature_labels": {
  "age": "float"
 },
 "outcome_labels": {
  "0": "major",
  "1": "minor"
 },
 "dataset_config": "project/datasets.json",
 "sources": [
  "MyProject"
 ],
 "annotations": "project/annotations.csv",
 "validation_strategy": "none",
 "validation_fraction": null,
 "validation_k_fold": 3,
 "k_fold_i": null,
 "filters": null,
 "hp": {
  "augment": "xyrj",
  "batch_size": 16,
  "drop_images": false,
  "dropout": 0,
  "early_stop": false,
  "early_stop_method": "loss",
  "early_stop_patience": 0,
  "epochs": [
   3
  ],
  "hidden_layer_width": 500,
  "hidden_layers": 0,
  "include_top": true,
  "l1": 0.0,
  "l1_dense": 0.0,
  "l2": 0.0,
  "l2_dense": 0.0,
  "learning_rate": 0.0001,
  "learning_rate_decay": 0,
  "learning_rate_decay_steps": 100000,
  "loss": "sparse_categorical_crossentropy",
  "manual_early_stop_batch": null,
  "manual_early_stop_epoch": null,
  "model": "xception",
  "normalizer": null,
  "normalizer_source": null,
  "optimizer": "Adam",
  "pooling": "max",
  "tile_px": 299,
  "tile_um": 100,
  "toplayer_epochs": 0,
  "trainable_layers": 0,
  "training_balance": "category",
  "uq": false,
  "validation_balance": "none"
 },
 "training_kwargs": {
  "save_predictions": "csv"
 }
}

This is, by the way, only with one clinical variable (age). Thank you so much for looking into this!

jinnyjuice commented 1 year ago

Never mind, I think I got it working and am currently training a model with multiple clinical variables. Apparently it had nothing to do with the patch, but rather with the conda environment that I had re-installed (evidently incorrectly).

Thank you so much for your help! I will let you know how training and testing turns out.

BTW: Is it possible to use both clinical variables and tfrecords as input, and train for a linear outcome? I know that the keyword argument "input_header" is available in methods like Project.train or Project.evaluate... But is there a way to pass that input_header argument to sf.model.LinearTrainer? Or do I have to use the keyword argument "slide_input"? Apparently, it's supposed to be a dictionary... can I then just pass a list of dictionaries? Such as:

import pandas as pd
import slideflow as sf

# Build one {slide: value} dictionary per clinical variable from the annotations file
csv = 'project/annotations.csv'
df = pd.read_csv(csv)
age_dict = df.set_index('slide').to_dict()['age']
sex_dict = df.set_index('slide').to_dict()['sex']
asa_dict = df.set_index('slide').to_dict()['asa']
height_dict = df.set_index('slide').to_dict()['height']
weight_dict = df.set_index('slide').to_dict()['weight']

multi_input = [age_dict, sex_dict, asa_dict, height_dict, weight_dict]

# hp, labels, and dataset1 are defined earlier (hyperparameters, outcome labels, training dataset)
my_trainer = sf.model.LinearTrainer(
    hp=hp,
    slide_input=multi_input,
    outdir='outputs',
    labels=labels,
)

my_trainer.train(dataset1, None)

It looks like it's running, but I'm not sure if the clinical variables are really being processed... Do you know what I mean?

jamesdolezal commented 1 year ago

Glad to hear it!

Training to linear outcomes is straightforward. All you have to do is choose a linear loss function in the hyperparameters (e.g. "mean_squared_error") and an outcome that can be interpreted as a continuous variable, and it should just work. You can still use the same P.train() and P.evaluate() interface, and clinical variable input will still work as well.
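As a minimal sketch (assuming an annotation column such as "age" is used as the continuous outcome; the column names here are placeholders taken from this thread):

import slideflow as sf

P = sf.load_project('project')

# A linear loss such as mean squared error tells the model to treat the outcome as continuous
hp = sf.ModelParams(
    tile_px=299,
    tile_um=100,
    loss='mean_squared_error',
)

# Clinical variables can still be passed via input_header, as with categorical outcomes
P.train(
    'age',
    params=hp,
    val_strategy='none',
    input_header=['sex', 'height', 'weight'],
)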

The LinearTrainer provides low-level API functionality for building custom trainers and integrating other DL frameworks - you shouldn't need to use it directly for anything routine.

jinnyjuice commented 1 year ago

Alright, thanks for your help! Yeah, I didn't see that I don't really need the LinearTrainer for a linear outcome. Quick question: is it possible to train the MIL and CLAM models for a linear outcome measure? I've seen that there's this keyword argument bag_loss (primary loss function) which can be either ‘ce’ or ‘svm’... it's not possible to change it to something like rmse / mean_squared_error, is it?

jamesdolezal commented 1 year ago

Training MIL models with linear outcomes is under development! (see PR https://github.com/jamesdolezal/slideflow/pull/287). The plan is to add this in version 2.1, which is still 1-2 months out.

jinnyjuice commented 1 year ago

I have another question: Apparently, all clinical variables that are used as input for the neural network are treated as float variables. For example, when I look into the params.json it looks like this:

"input_features": [
  "age",
  "sex",
  "asa",
  "bmi"
],
"input_feature_sizes": [
  1,
  1,
  1,
  1
],
"input_feature_labels": {
  "age": "float",
  "sex": "float",
  "asa": "float",
  "bmi": "float"
},

Would it make sense to change the parameters to tell the network that, for example, variables like sex or asa (unlike age or bmi) are actually categorical (or ordinal) rather than float? If so, how can I change it?

jamesdolezal commented 1 year ago

You can definitely mix float and categorical variables. Any variable that can be read as a continuous value (e.g. coded with 0 and 1) will be interpreted as float. Is this how "sex" and "asa" are encoded? If so, you can force categorical interpretation by changing "0" and "1" to "M" and "F", for example.
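As a rough sketch of that recoding (assuming "sex" is stored as an integer 0/1 column in the annotations file used earlier in this thread; the column name and coding are assumptions):

import pandas as pd

# Replace the 0/1 coding with string labels so the variable is interpreted
# as categorical rather than float (column name and coding are assumed).
df = pd.read_csv('project/annotations.csv')
df['sex'] = df['sex'].map({0: 'M', 1: 'F'})
df.to_csv('project/annotations.csv', index=False)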

jinnyjuice commented 10 months ago

Hello James,

I have a very quick question: So I trained a model with default 3-fold cross-validation. It automatically created the splits.json with data[0]['strategy'] stating the strategy method, data[0]['patients'] summing up all patients and data[0]['tfrecords']['k-fold-1'], ['k-fold-2'] and ['k-fold-3'].

Now this might be a stupid question, but: how do I know which two thirds were used for training and which third was used for validation? There's no such information in the splits.json stating something like: first run trains on A+B, validates on C; second run trains on A+C, validates on B; third run trains on B+C, validates on A. It could be any order, and I couldn't figure it out from the documentation you provided.

Do you know what I mean?

The reason I want to know is that I eventually want to generate heatmaps only for the validation group, because I want to understand which parts of the tissue were relevant to the validation results. For the first fold, for example, I would take the model that was trained on some two thirds - but I would have to locate exactly the remaining third of patients that was not used for training. Does that make sense? Maybe you could comment on this as well.

Would greatly appreciate it! Thank you so much so far.

Cheers

jamesdolezal commented 10 months ago

Hi Jinny - thanks for the question, this could be better clarified in the documentation.

The best way to determine what data was used for training/validation is to view the slide_manifest.csv file created in the model folder during training. This is a CSV file with three columns - the slide name, the outcome label, and the dataset (training/validation).

In addition to manually viewing the file, you can quickly and automatically pull the list of slides that were used for model training or validation using sf.util.get_slides_from_model_manifest(), specifying whether you want to retrieve the training or validation slides using the parameter dataset:

import slideflow as sf

model_path = '/path/to/saved_model'

val_slides = sf.util.get_slides_from_model_manifest(model_path, dataset='validation')

You can then create a dataset from only those slides and use it for generating heatmaps or other downstream tasks:

P = sf.Project(...)
val_dataset = P.dataset(..., filters={'slide': val_slides})

To answer your question more directly, though: splits.json has the slides/tfrecords split into a number of groups equal to your k-fold (in your case, 3). For k-fold 1, the first group (A) is validation and the remainder (B+C) is training. For k-fold 2, the second group (B) is validation and the remainder (A+C) is training. And so on.
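As a rough illustration (assuming the splits.json structure you described, with data[0]['tfrecords'] keyed by 'k-fold-1', 'k-fold-2', 'k-fold-3'; the exact path and keys may differ in your project):

import json

# Load the saved cross-validation splits (path is an assumption)
with open('project/splits.json') as f:
    splits = json.load(f)

groups = splits[0]['tfrecords']  # one entry per k-fold group

# For each cross-validation model, one group is held out for validation
# and the remaining groups are used for training.
for val_fold in sorted(groups):
    train_folds = [g for g in sorted(groups) if g != val_fold]
    print(f"Model {val_fold}: validation = {val_fold}, training = {train_folds}")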

I appreciate you asking; I realize now that I failed to include this information in the documentation. I'll add a section explaining this more clearly.

Let me know if that makes sense or if I can help clarify further!

jinnyjuice commented 10 months ago

Thank you, James, this was super fast! Yes, it totally makes sense =)