Open dengxl0520 opened 1 year ago
@dengxl0520 Do you have any update on this? I was shocked by the amount of over-engineering in their projects.
@dengxl0520 I have just read through dataset_generator.py, and it looks like the team had access to a CAMUS dataset with fully annotated sequences from ED to ES. This confuses me, because the public CAMUS training set does not contain any of this information.
@gungui98 About the dataset, see the paper https://arxiv.org/abs/2112.02102. It mentions a dataset called TED, which has fully annotated sequences from ED to ES. But I still don't understand how to use this dataset...
@dengxl0520 It seems they extended the original dataset to the full cycle by manual annotation. I have emailed the author for the dataset but haven't received a response yet.
@gungui98 you can download it from here (https://humanheart-project.creatis.insa-lyon.fr/ted.html)
@dengxl0520 I have successfully processed the dataset and got the h5 file. First, you have to run the script with the full cycle option, where the input dataset is the one from the link you provided:
python dataset_generator.py --output ~/data/camus.h5 --sequence_type full_cycle ~/data/camus_full_cycle/TED/database/
I have also skipped the k-fold part: I simply split the dataset 80/10/10 into train/test/val by changing the function get_fold_subset_from_file
from vital/vital/data/camus/dataset_generator.py
into
def get_fold_subset_from_file(
    cls, data: Path, fold: int, subset: Literal["training", "validation", "testing"]
) -> List[str]:
    """Reads patient ids for a subset of a cross-validation configuration.

    Args:
        data: Path to the CAMUS root directory, under which the patient directories are stored.
        fold: ID of the test set for the cross-validation configuration.
        subset: Name of the subset for which to fetch patient IDs for the cross-validation configuration.

    Returns:
        IDs of the patients that are included in the subset of the fold.
    """
    # list_fn = data / "listSubGroups" / f"subGroup{fold}_{subset}.txt"
    # # Open text file containing patient ids (one patient id by row)
    # with open(str(list_fn), "r") as f:
    #     patient_ids = [line for line in f.read().splitlines()]
    import glob
    patient_ids = glob.glob(str(data / "*"))
    # patient_ids = sorted(patient_ids)
    train_set = patient_ids[:int(len(patient_ids) * 0.8)]
    test_set = patient_ids[int(len(patient_ids) * 0.8):int(len(patient_ids) * 0.9)]
    val_set = patient_ids[int(len(patient_ids) * 0.9):]
    if subset == "training":
        return train_set
    if subset == "testing":
        return test_set
    return val_set
I will try to implement a correct, fixed k-fold split later, but this simply got things running for now. PS: I have also trained a model with this file, but with the CRISP project!
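Since the function above is called once per subset, the three slices only stay disjoint if the patient list comes back in the same order every time, which an unsorted glob does not guarantee. Below is a minimal sketch of a reproducible 80/10/10 split. The helper name `split_patient_ids` and the `seed` parameter are made up for illustration and are not part of vital; it also assumes plain patient IDs (directory names) rather than the full paths that glob returns.

```python
import random
from typing import List, Sequence


def split_patient_ids(patient_ids: Sequence[str], subset: str, seed: int = 0) -> List[str]:
    """Return the 80/10/10 slice of the patient IDs for the given subset.

    Hypothetical helper, not part of vital. Sorting first and shuffling with
    a fixed seed means every call sees the same order, so the three subsets
    never overlap between calls.
    """
    ids = sorted(patient_ids)
    random.Random(seed).shuffle(ids)

    train_end = int(len(ids) * 0.8)
    test_end = int(len(ids) * 0.9)

    if subset == "training":
        return ids[:train_end]
    if subset == "testing":
        return ids[train_end:test_end]
    if subset == "validation":
        return ids[test_end:]
    raise ValueError(f"Unknown subset: {subset!r}")


# Example with made-up patient IDs.
example_ids = [f"patient{i:04d}" for i in range(1, 21)]
train_ids = split_patient_ids(example_ids, "training")
test_ids = split_patient_ids(example_ids, "testing")
val_ids = split_patient_ids(example_ids, "validation")
print(len(train_ids), len(test_ids), len(val_ids))
```

Inside get_fold_subset_from_file, the IDs could be collected with something like `[p.name for p in data.iterdir() if p.is_dir()]` before calling the helper, assuming each patient has its own directory directly under the root.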
@gungui98 I tried modifying dataset_generator.py
like you did, but when I ran the script I hit another problem.
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 363, in <module>
    main()
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 349, in main
    CrossValidationDatasetGenerator()(
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 118, in __call__
    self._write_patient_data(dataset.create_group(patient_id))
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 176, in _write_patient_data
    data_x_proc = resize_image(data_x, self.target_image_size, resample=Resampling.BILINEAR)
  File "/home/dengxiaolong/miniconda3/envs/castor/lib/python3.10/site-packages/vital/utils/image/transform.py", line 22, in resize_image
    resized_image = np.array(Image.fromarray(image).resize(size, resample=resample))
  File "/home/dengxiaolong/miniconda3/envs/castor/lib/python3.10/site-packages/PIL/Image.py", line 2955, in fromarray
    raise TypeError("Cannot handle this data type: %s, %s" % typekey) from e
TypeError: Cannot handle this data type: (1, 1, 748), |u1
@dengxl0520 I'm not really sure about your problem, but it could come from reading the image data. This is the code that I used:
https://gist.github.com/gungui98/364e8f77930880132dee9704aca9a90d
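One way to check whether the image reading is the cause: as far as I can tell, the `(1, 1, 748)` in the TypeError is built by Pillow from the array's shape beyond its first two axes, so resize_image most likely received a 3D uint8 array whose last axis has length 748 instead of a single 2D frame. The sketch below only shows how a stack could be broken into 2D frames before resizing. The helper `split_into_2d_frames` is hypothetical, and treating the last axis as the frame axis is a guess; it could just as well be a mis-ordered spatial axis, so print `data_x.shape` first.

```python
from typing import List

import numpy as np


def split_into_2d_frames(volume: np.ndarray, frame_axis: int = -1) -> List[np.ndarray]:
    """Split a 2D image or a 3D stack into a list of 2D frames.

    Hypothetical helper, not part of vital. `frame_axis` is the axis assumed
    to index frames; it is moved to the front and then iterated over.
    """
    if volume.ndim == 2:
        return [volume]
    if volume.ndim != 3:
        raise ValueError(f"Expected a 2D or 3D array, got shape {volume.shape}")
    frames_first = np.moveaxis(volume, frame_axis, 0)
    return [frame for frame in frames_first]


# Made-up array that mimics the shape hinted at by the error message:
# two small leading axes and a trailing axis of length 748.
stack = np.zeros((4, 5, 748), dtype=np.uint8)
frames = split_into_2d_frames(stack, frame_axis=-1)
print(len(frames), frames[0].shape)
```

Each returned frame is 2D and could then be passed to resize_image one at a time.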
I want to use the CAMUS dataset with this project, but I have a problem: I used castor/vital/vital/data/camus/dataset_generator.py to generate the HDF5 file, but I got an error:
How can I get the 'subGroup1_training.txt' file?
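If the split files are simply missing, one option is to generate them. Judging only from the commented-out lines in get_fold_subset_from_file, the generator looks for listSubGroups/subGroup{fold}_{subset}.txt under the dataset root, with one patient ID per row. The sketch below writes such files for a plain k-fold split. The helpers `make_folds` and `write_fold_files`, the choice of 10 folds, and numbering folds from 1 are all assumptions, and the resulting splits will not match the official CAMUS cross-validation splits.

```python
from pathlib import Path
from typing import Dict, List, Sequence


def make_folds(patient_ids: Sequence[str], n_folds: int = 10) -> Dict[int, Dict[str, List[str]]]:
    """Build k-fold train/val/test splits (hypothetical helper, not part of vital).

    Fold k (numbered from 1) uses chunk k as the test set, the next chunk as
    the validation set, and all remaining chunks as the training set.
    """
    if n_folds < 3:
        raise ValueError("n_folds must be at least 3 to leave a training set")
    ids = sorted(patient_ids)
    chunks = [ids[i::n_folds] for i in range(n_folds)]

    folds: Dict[int, Dict[str, List[str]]] = {}
    for k in range(n_folds):
        val_k = (k + 1) % n_folds
        train = sorted(
            pid for j, chunk in enumerate(chunks) if j not in (k, val_k) for pid in chunk
        )
        folds[k + 1] = {"training": train, "validation": chunks[val_k], "testing": chunks[k]}
    return folds


def write_fold_files(root: Path, folds: Dict[int, Dict[str, List[str]]]) -> None:
    """Write one text file per fold and subset, one patient ID per row."""
    out_dir = root / "listSubGroups"
    out_dir.mkdir(parents=True, exist_ok=True)
    for fold, subsets in folds.items():
        for subset, ids in subsets.items():
            list_fn = out_dir / f"subGroup{fold}_{subset}.txt"
            list_fn.write_text("\n".join(ids) + "\n")
```

Usage would be something like `write_fold_files(root, make_folds([p.name for p in root.iterdir() if p.is_dir()]))`, where `root` is the dataset directory passed to dataset_generator.py and each patient is assumed to have its own subdirectory there.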