Open camappel opened 2 months ago
@camappel, thanks for opening the notebook idea. Testing the new prebuilt models from the DeepForest package sounds great.
The notebook incorporates suggestions from the closed issue #251, so I'm happy to support the submission of the notebook.
Please move to the preparation stage and comment here if you run into any issues.
Great, thanks! I'm going to use the same PR as the last one since the environment will be similar.
@camappel - LGTM. Please feel free to explore the pooch library (see docs) for fetching the suggested labelled dataset.
Hi @acocac, I have a couple of questions about the capacity of the binder environment, and how to structure the notebook accordingly.
Currently in the notebook, I fetch and process a large dataset from dataverse, partition by train/validate/test, then fine-tune the livestock detection model on the train/validate sets. I then evaluate the baseline and fine-tuned models on the test set.
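For reference, the partition step described above could look roughly like the sketch below. It rests on assumptions that are not from the notebook: annotations are rows carrying an `image_path` key (the DeepForest CSV convention), the split is 80/10/10, and the function name `split_by_image` is purely illustrative. Splitting by image rather than by row keeps all boxes of one image in the same partition.

```python
import random
from collections import defaultdict


def split_by_image(rows, val_frac=0.1, test_frac=0.1, seed=42):
    """Partition annotation rows into train/validate/test by image.

    `rows` is a list of dicts that each carry an "image_path" key.
    All rows belonging to one image land in the same partition.
    The fractions and seed are illustrative defaults (assumptions).
    """
    by_image = defaultdict(list)
    for row in rows:
        by_image[row["image_path"]].append(row)

    # Sort before shuffling so the result depends only on the seed.
    images = sorted(by_image)
    random.Random(seed).shuffle(images)

    n_test = round(len(images) * test_frac)
    n_val = round(len(images) * val_frac)
    groups = {
        "test": images[:n_test],
        "validate": images[n_test:n_test + n_val],
        "train": images[n_test + n_val:],
    }
    return {
        name: [row for img in imgs for row in by_image[img]]
        for name, imgs in groups.items()
    }
```

Each partition can then be written to its own CSV, so that only the test CSV and its images need to be shipped with the notebook.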
My questions are:
So I think the notebook structure could be:
Please let me know your thoughts! Thanks
Hi @camappel - thanks for sharing updates!
- Can I download the entire dataset in the notebook? I believe you previously said to just download a couple of sample images for visualisation, but the evaluation step requires a whole dataset to get the relevant metrics. Maybe instead, I could just download the test set (10%) in the notebook for the evaluation section?
The main notebook should only download the test set. You could archive it in Zenodo and refer to the original dataset (and its license) in the metadata/description of the Zenodo repository (see, for instance, the subset dataset used in the COSMOS-UK notebook).
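A minimal sketch of how the archived test set might be fetched with pooch, assuming it is uploaded to Zenodo as a single zip file. The record ID, file name and checksum do not exist yet, so they are left as parameters; the helper names `zenodo_file_url` and `fetch_test_set` are hypothetical, and the URL pattern is the one Zenodo currently uses for file downloads.

```python
def zenodo_file_url(record_id, filename):
    """Build the direct-download URL of one file in a Zenodo record."""
    return f"https://zenodo.org/records/{record_id}/files/{filename}?download=1"


def fetch_test_set(record_id, filename, known_hash=None, cache_dir="data"):
    """Download and unzip the archived test set; return the extracted paths.

    `known_hash` should be the published checksum (e.g. "md5:...").
    Passing None skips verification and is only suitable while drafting.
    """
    import pooch  # imported lazily so the URL helper works without pooch

    return pooch.retrieve(
        url=zenodo_file_url(record_id, filename),
        known_hash=known_hash,
        path=cache_dir,
        processor=pooch.Unzip(),  # makes retrieve return the unzipped file paths
    )
```

pooch caches the download in `cache_dir`, so re-running the notebook does not fetch the archive again unless the checksum changes.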
- Can I train the model in the notebook? One previous notebook included the training step, but another just downloaded the weights. The problem with the latter, however, is that it does not demonstrate the training process, and I would like to show how to configure the model and create the trainer (only a few lines of code).
I suggest adding a markdown cell where you highlight the training process (see an example here).
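As an illustration of what such a cell might show, here is a hedged sketch of the configure-and-train steps. It assumes DeepForest 1.4.x, where `model.config` is a nested dict and the trainer is built with `create_trainer()`. The model name `weecology/deepforest-livestock` and the default of 5 epochs are assumptions to check against the DeepForest documentation, and the function names are illustrative.

```python
def apply_training_config(config, train_csv, val_csv, image_dir, epochs=5):
    """Point a DeepForest-style config dict at local annotations."""
    config["train"]["csv_file"] = train_csv
    config["train"]["root_dir"] = image_dir
    config["train"]["epochs"] = epochs
    config["validation"]["csv_file"] = val_csv
    config["validation"]["root_dir"] = image_dir
    return config


def fine_tune(train_csv, val_csv, image_dir, epochs=5):
    """Fine-tune the prebuilt livestock model on local annotations."""
    from deepforest import main  # lazy import: needs the deepforest package

    model = main.deepforest()
    # Assumed Hugging Face model name; confirm in the DeepForest docs.
    model.load_model(model_name="weecology/deepforest-livestock")

    apply_training_config(model.config, train_csv, val_csv, image_dir, epochs)

    # Build the PyTorch Lightning trainer from the config, then train.
    model.create_trainer()
    model.trainer.fit(model)
    return model
```

In the published notebook this could sit in a markdown cell as non-executed code, with the fine-tuned weights downloaded instead of recomputed.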
So I think the notebook structure could be:
The structure looks good to me. Thanks for your effort in validating/sharing progress of your submission.
@camappel we have started the PRE-REVIEW phase. Fingers crossed for constructive feedback on your submission!
What is the notebook about?
This notebook will explore the capabilities of the DeepForest package. In particular, it will demonstrate how to:
The prebuilt livestock model was trained on a limited dataset. According to the package's documentation, "the prebuilt models will always be improved by adding data from the target area". As such, this notebook will explore how much fine-tuning on local data improves the model's livestock detection performance.
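One way the improvement could be quantified is to evaluate both models on the same test set and take the difference in box precision and recall. The sketch below assumes DeepForest's `evaluate(csv_file, root_dir, iou_threshold=...)` method, which returns a dict containing `box_precision` and `box_recall`. The helper names are illustrative and the 0.4 IoU threshold is an assumed default, not a value taken from the notebook.

```python
METRICS = ("box_precision", "box_recall")


def metric_change(baseline, fine_tuned, metrics=METRICS):
    """Return fine-tuned minus baseline for each metric name."""
    return {name: fine_tuned[name] - baseline[name] for name in metrics}


def compare_models(baseline_model, tuned_model, test_csv, image_dir,
                   iou_threshold=0.4):
    """Evaluate both models on the same test set and report the change.

    Both models are expected to expose a DeepForest-style `evaluate`
    method; positive values mean the fine-tuned model scored higher.
    """
    baseline = baseline_model.evaluate(
        test_csv, image_dir, iou_threshold=iou_threshold
    )
    tuned = tuned_model.evaluate(
        test_csv, image_dir, iou_threshold=iou_threshold
    )
    return metric_change(baseline, tuned)
```

Using the same test CSV and threshold for both calls keeps the comparison like-for-like.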
Data Science Component
Submission type
Programming language
Checklist:
Additional information
An EDS notebook already exists on tree crown detection using DeepForest. This notebook is different because it focusses on the latest version of DeepForest (1.4.0), which includes a new prebuilt livestock model, and also demonstrates how to fine-tune the model.