SFrav closed this issue 2 years ago
I ended up combining the two npz folders and patchlet-info csvs with the following code:
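import glob
import os
import pandas as pd
from distutils.dir_util import copy_tree  # removed in Python 3.12; shutil.copytree(src, dst, dirs_exist_ok=True) works there
patchpos = pd.read_csv(os.path.join(PROJECT_DATA_ROOT, 'patchlet-info.csv'))
patchneg = pd.read_csv(os.path.join(PROJECT_DATA_ROOT, 'patchlet-info-neg.csv'))
# regex=False so that the '.' in '.npz' is matched literally
patchneg['chunk'] = patchneg['chunk'].str.replace('.npz', '_neg.npz', regex=False)
# Append the negative rows to the positive ones before overwriting the csv
patchall = pd.concat([patchpos, patchneg], ignore_index=True)
patchall.to_csv(os.path.join(PROJECT_DATA_ROOT, 'patchlet-info.csv'), index=False, header=True)
# Rename the negative npz files so they match the updated 'chunk' column
for f in glob.glob(r'path/to/patchlets_npz_neg/*.npz'):
    os.rename(f, f.replace('.npz', '_neg.npz'))
copy_tree(os.path.join('field-delineation', 'input-data', 'patchlets_npz_neg'),
          os.path.join(PROJECT_DATA_ROOT, 'patchlets_npz'))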
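Once the csv and the npz folder have been merged, it may be worth checking that every name in the `chunk` column still points at a file on disk. The following is only a sketch under assumptions: `missing_chunks` and `demo` are hypothetical helpers (not part of field-delineation), the `chunk` column is assumed to hold the npz file name, and the file names in the demo are made up.

```python
import csv
import os
import tempfile


def missing_chunks(csv_path, npz_dir, column="chunk"):
    """Return chunk names listed in the csv that have no file in npz_dir."""
    on_disk = set(os.listdir(npz_dir))
    with open(csv_path, newline="") as fh:
        # basename() in case the column stores a path rather than a bare name
        listed = {os.path.basename(row[column]) for row in csv.DictReader(fh)}
    return sorted(listed - on_disk)


def demo():
    """Build a throwaway folder and csv, then report the unmatched chunk names."""
    with tempfile.TemporaryDirectory() as root:
        npz_dir = os.path.join(root, "patchlets_npz")
        os.makedirs(npz_dir)
        # Two empty placeholder files standing in for real npz chunks
        for name in ("a.npz", "a_neg.npz"):
            open(os.path.join(npz_dir, name), "wb").close()
        csv_path = os.path.join(root, "patchlet-info.csv")
        with open(csv_path, "w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(["chunk"])
            writer.writerows([["a.npz"], ["a_neg.npz"], ["b_neg.npz"]])
        return missing_chunks(csv_path, npz_dir)
```

An empty list from `missing_chunks` would suggest that the renaming in the csv and the renaming on disk agree; any names it returns are rows whose npz file was not copied or not renamed.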
Also, I shuffled three times instead of once, replicating line 272 of tf_data_utils.py (hope that's right?)
How do we combine positive and negative samples in the k-folds split stage?
I have created two separate patchlet and npz folders and have two separate patchlet-info.csv files. The npz files and the patchlet-info data have the same file naming pattern, so I'm not sure how to combine them. I don't think it would work to run the function twice and append the patchlet-info csvs.
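To illustrate why a plain append is a problem here, the sketch below checks two lists of chunk names for collisions and shows that a suffix on one side removes them. The helper names and the example file names are hypothetical and only stand in for the real contents of the two patchlet-info csvs.

```python
def overlapping_names(pos_names, neg_names):
    """Return the file names that occur in both lists."""
    return sorted(set(pos_names) & set(neg_names))


def add_suffix(names, suffix="_neg", ext=".npz"):
    """Insert suffix before the extension of every name that ends with ext."""
    return [
        n[: -len(ext)] + suffix + ext if n.endswith(ext) else n
        for n in names
    ]


# Made-up names following an identical pattern in both sets
pos = ["patchlets_0.npz", "patchlets_1.npz"]
neg = ["patchlets_0.npz", "patchlets_1.npz"]

collisions_before = overlapping_names(pos, neg)
collisions_after = overlapping_names(pos, add_suffix(neg))
```

With identical naming, every negative file would overwrite or shadow a positive one after a merge; once one side carries a distinguishing suffix, the two sets can share a folder and a csv.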