`raw.dvc` currently tracks the entire `data/raw` folder. That is clean and convenient, but the file is opaque to Git, so a merge can leave `raw.dvc` tracking the wrong set of files. So far this has been manageable only because Ayyub has mostly been the one keeping the file up to date. It can quickly become a source of annoyance once more people modify it in different PRs.
One way to remedy this is to track the files in `data/raw` individually, which means there will be one `.dvc` file for each file found in `data/raw`. Care must be taken over where the `.dvc` files are stored so that the repo doesn't become too messy, but it will eliminate merge conflicts in this area, since PRs that touch different raw files will no longer edit the same `.dvc` file. @ayyubibrahimi does that sound good to you?
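To make the proposal concrete, here is a rough sketch of how the switch could be scripted. It is only an illustration, not a tested migration: the helper names `dvc_add_commands` and `track_files` are made up for this comment, and it assumes the directory-level `raw.dvc` has already been dropped (for example with `dvc remove raw.dvc`) and that DVC's default behaviour of writing each `<file>.dvc` next to the data file is acceptable.

```python
from pathlib import Path
import subprocess


def dvc_add_commands(raw_dir):
    """Build one `dvc add` command per data file under raw_dir (hypothetical helper)."""
    commands = []
    for path in sorted(Path(raw_dir).rglob("*")):
        # Skip directories, existing .dvc files, and the .gitignore files DVC writes.
        if not path.is_file() or path.suffix == ".dvc" or path.name == ".gitignore":
            continue
        commands.append(["dvc", "add", path.as_posix()])
    return commands


def track_files(raw_dir, dry_run=True):
    """Print each command; run it only when dry_run is False (hypothetical helper)."""
    for cmd in dvc_add_commands(raw_dir):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `track_files("data/raw")` only prints the commands, so the list can be reviewed first; `track_files("data/raw", dry_run=False)` would actually run them. If we would rather keep the `.dvc` files out of `data/raw`, that placement question still needs to be settled separately.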