Closed · espositoandrea closed this 2 years ago
Due to some issues, we had to restart the work we've done related to this issue. I'll be taking over this ticket.
Given the dataset's size, I'll upload only 14 positive and 20 negative PET scans, with the following IDs:
PET Scan ID | Class
---|---
OAS30027_PIB_PUPTIMECOURSE_d1300 | Negative |
OAS30132_PIB_PUPTIMECOURSE_d0063 | Negative |
OAS30206_AV45_PUPTIMECOURSE_d3024 | Negative |
OAS30273_PIB_PUPTIMECOURSE_d0077 | Negative |
OAS30479_AV45_PUPTIMECOURSE_d2421 | Negative |
OAS30662_PIB_PUPTIMECOURSE_d1615 | Negative |
OAS30687_PIB_PUPTIMECOURSE_d0126 | Negative |
OAS30713_PIB_PUPTIMECOURSE_d0095 | Negative |
OAS30713_PIB_PUPTIMECOURSE_d1692 | Negative |
OAS30818_AV45_PUPTIMECOURSE_d1720 | Negative |
OAS30818_AV45_PUPTIMECOURSE_d2089 | Negative |
OAS30818_PIB_PUPTIMECOURSE_d0097 | Negative |
OAS30818_PIB_PUPTIMECOURSE_d1214 | Negative |
OAS30863_PIB_PUPTIMECOURSE_d1531 | Negative |
OAS30867_AV45_PUPTIMECOURSE_d4407 | Negative |
OAS30867_PIB_PUPTIMECOURSE_d0480 | Negative |
OAS30869_PIB_PUPTIMECOURSE_d0152 | Negative |
OAS30899_PIB_PUPTIMECOURSE_d0070 | Negative |
OAS30964_PIB_PUPTIMECOURSE_d1142 | Negative |
OAS30964_PIB_PUPTIMECOURSE_d1533 | Negative |
OAS30024_AV45_PUPTIMECOURSE_d0084 | Positive |
OAS30027_PIB_PUPTIMECOURSE_d2394 | Positive |
OAS30031_PIB_PUPTIMECOURSE_d0236 | Positive |
OAS30035_PIB_PUPTIMECOURSE_d3893 | Positive |
OAS30040_PIB_PUPTIMECOURSE_d4424 | Positive |
OAS30051_PIB_PUPTIMECOURSE_d0081 | Positive |
OAS30078_PIB_PUPTIMECOURSE_d0136 | Positive |
OAS30085_PIB_PUPTIMECOURSE_d1566 | Positive |
OAS30087_PIB_PUPTIMECOURSE_d0096 | Positive |
OAS30114_AV45_PUPTIMECOURSE_d0086 | Positive |
OAS30119_PIB_PUPTIMECOURSE_d1615 | Positive |
OAS30119_PIB_PUPTIMECOURSE_d2595 | Positive |
OAS30119_PIB_PUPTIMECOURSE_d3722 | Positive |
OAS30128_AV45_PUPTIMECOURSE_d0044 | Positive |
As we presented to the professors today (2021-11-04), we have successfully tracked the data-creation and data-processing pipeline. We should try to merge the DVC data pipeline with the MLflow model tracking, but we can think about that in a new ticket and, if needed, a separate PR.
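One simple way to connect the two sides (a sketch only; the helper and the `data/folds.dvc` path are hypothetical, not the actual repo layout) is to log a version tag derived from the DVC pointer file with each MLflow run, so every trained model records which version of the folds it saw:

```python
import hashlib
from pathlib import Path


def data_version(dvc_file: str) -> str:
    """Derive a short data-version tag from a DVC pointer file.

    The .dvc file already contains the content hash of the tracked data,
    so hashing the pointer file itself is enough to identify the data
    version without reading the (huge) data directory.
    """
    content = Path(dvc_file).read_bytes()
    return hashlib.sha256(content).hexdigest()[:12]


# In a training script this tag could then be attached to the run, roughly:
#   import mlflow
#   with mlflow.start_run():
#       mlflow.log_param("data_version", data_version("data/folds.dvc"))
#       ...  # train and log the model as usual
```

The tag changes whenever `dvc add`/`dvc repro` updates the pointer file, which is exactly when the underlying folds change.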
At the moment, we have only uploaded the final pre-processed folds using DVC. We should port the pipeline to Python scripts and provide the "raw" data that produced the final folds. Since the full dataset is huge (over 2 TB), we'll include only a very small subset as a proof of concept of the pipeline, and later apply the same pipeline to more data.
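Once the pipeline is ported to scripts, it could also be described as DVC stages so that `dvc repro` rebuilds only what changed. A rough sketch of what the `dvc.yaml` might look like (script names and paths here are placeholders, not the actual repo layout):

```yaml
stages:
  preprocess:
    cmd: python scripts/preprocess.py data/raw data/preprocessed
    deps:
      - scripts/preprocess.py
      - data/raw
    outs:
      - data/preprocessed
  make_folds:
    cmd: python scripts/make_folds.py data/preprocessed data/folds
    deps:
      - scripts/make_folds.py
      - data/preprocessed
    outs:
      - data/folds
```

A nice property of this setup is that the same `dvc.yaml` works unchanged once we swap the proof-of-concept subset in `data/raw` for the full 2 TB dataset.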