sentinel-hub / field-delineation

Field delineation with Sentinel-2 data from Sentinel-Hub and a ResUnet-a architecture.
MIT License

K-folds split with negative samples #15

Closed: SFrav closed this issue 2 years ago

SFrav commented 2 years ago

How do we combine positive and negative samples in the k-folds split stage?

I have created two separate patchlet and npz folders and have two separate patchlet-info.csv files. The npz files and the patchlet-info data use the same file naming pattern in both sets, so I'm not sure how to combine them. I don't think it would work to run the function twice and append the patchlet-info CSVs.

SFrav commented 2 years ago

I ended up combining the two npz folders and patchlet-info csvs with the following code:

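import glob
import os
from distutils.dir_util import copy_tree  # removed in Python 3.12; shutil.copytree(..., dirs_exist_ok=True) replaces it

import pandas as pd

patchpos = pd.read_csv(os.path.join(PROJECT_DATA_ROOT, 'patchlet-info.csv'))
patchneg = pd.read_csv(os.path.join(PROJECT_DATA_ROOT, 'patchlet-info-neg.csv'))
# regex=False so the '.' in '.npz' is matched literally
patchneg['chunk'] = patchneg['chunk'].str.replace('.npz', '_neg.npz', regex=False)
# Append the negative rows before overwriting patchlet-info.csv
patchall = pd.concat([patchpos, patchneg], ignore_index=True)
patchall.to_csv(os.path.join(PROJECT_DATA_ROOT, 'patchlet-info.csv'), index=False, header=True)

# Rename the negative npz files to match the renamed 'chunk' entries
for f in glob.glob(r'path/to/patchlets_npz_neg/*.npz'):
    os.rename(f, f.replace('.npz', '_neg.npz'))

# Copy the renamed negative files into the positive npz folder
copy_tree(os.path.join('field-delineation', 'input-data', 'patchlets_npz_neg'),
          os.path.join(PROJECT_DATA_ROOT, 'patchlets_npz'))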

Also, I shuffled three times instead of once, replicating line 272 of tf_data_utils.py (hope that's right?)
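
For anyone repeating this, the merge step can be wrapped in a small function with two sanity checks before running the k-folds split. This is only a sketch, not part of the field-delineation code base. The helper names `merge_patchlet_info` and `missing_chunks` are hypothetical, and it assumes that the `chunk` column holds the npz file names, as in the snippet above.

```python
import os

import pandas as pd


def merge_patchlet_info(pos_csv, neg_csv, out_csv, suffix="_neg"):
    """Concatenate two patchlet-info tables, tagging the negative chunk names.

    Hypothetical helper: assumes both CSVs have a 'chunk' column with npz file names.
    """
    pos = pd.read_csv(pos_csv)
    neg = pd.read_csv(neg_csv)
    # regex=False treats '.npz' as a literal string, not a pattern
    neg["chunk"] = neg["chunk"].str.replace(".npz", f"{suffix}.npz", regex=False)
    # After renaming, no chunk file name may be shared between the two sets
    clash = set(pos["chunk"]) & set(neg["chunk"])
    if clash:
        raise ValueError(f"chunk names present in both sets: {sorted(clash)}")
    combined = pd.concat([pos, neg], ignore_index=True)
    combined.to_csv(out_csv, index=False)
    return combined


def missing_chunks(info, npz_dir):
    """Return chunk names listed in the table but absent from npz_dir."""
    on_disk = set(os.listdir(npz_dir))
    return sorted(set(info["chunk"]) - on_disk)
```

Calling `missing_chunks(combined, npz_dir)` after copying the renamed negative files should return an empty list. Any name it does return is a row in the combined CSV with no matching npz file on disk.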