Closed gopi77 closed 7 years ago
Hi all,
I used the C++ program below to copy the JPG files out of the individual, person-named LFW folders.
#include <cstdio>
#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main() {
    ifstream file("list.txt");
    string str;
    const string delimiter = "_0";  // separates the person name from the image index
    const string newExt = "jpg";

    while (getline(file, str)) {
        // The folder name is everything before the first "_0".
        string str_split = str.substr(0, str.find(delimiter));

        // Replace everything after the last '.' with the new extension.
        string::size_type i = str.rfind('.');
        if (i != string::npos) {
            str.replace(i + 1, string::npos, newExt);
        }

        string in = "lfw_funneled/" + str_split + "/" + str;
        string out = "images/" + str;
        cout << in << "\t" << out << "\n";

        // Copy the file byte for byte; skip entries whose source is missing.
        ifstream src(in, ios::binary);
        if (!src) {
            cerr << "cannot open " << in << "\n";
            continue;
        }
        ofstream dst(out, ios::binary);
        dst << src.rdbuf();
    }
    getchar();  // keep the console window open until a key is pressed
}
@gopi77 Thanks! I updated the README to refer to this post.
Summary:
Creating list.txt
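The details of this step did not survive in the thread. As a rough sketch, assuming list.txt holds one mask filename per line (names of the form Person_Name_0001.ppm, which is what the C++ program above expects), the list could be generated from a folder of PPM masks like this. The folder name `masks` and the helper `write_list` are hypothetical.

```python
import os


def write_list(mask_dir, list_path="list.txt", extension=".ppm"):
    """Write the sorted mask filenames found in mask_dir to list_path, one per line."""
    names = sorted(
        f for f in os.listdir(mask_dir)
        if os.path.isfile(os.path.join(mask_dir, f)) and f.lower().endswith(extension)
    )
    with open(list_path, "w") as out:
        for name in names:
            out.write(name + "\n")
    return names


if __name__ == "__main__":
    # "masks" is an assumed folder name; point it at the extracted PPM files.
    if os.path.isdir("masks"):
        print(len(write_list("masks")), "entries written to list.txt")
```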
Is there a Python implementation of this?
Try this:
from os import listdir, makedirs
from os.path import isfile, join, exists
import shutil


def getAllFileInFolder(folderPath, fileExtension):
    totalExtension = ''
    if fileExtension.startswith('.'):
        totalExtension = fileExtension
    else:
        totalExtension = '.' + fileExtension
    return [f for f in listdir(folderPath) if isfile(join(folderPath, f)) and f.endswith(totalExtension)]


def getAllFoldersInFolder(folderPath):
    return [f for f in listdir(folderPath) if not isfile(join(folderPath, f))]


# Start
if __name__ == '__main__':
    dataset_folder = 'lfw_funneled'

    # get all folder names
    folders = getAllFoldersInFolder(dataset_folder)

    # get all file names
    image_path = []
    image_name = []
    for folder in folders:
        subfolder_path = join(dataset_folder, folder)
        files = getAllFileInFolder(subfolder_path, 'jpg')
        for f in files:
            image_path.append(subfolder_path)
            image_name.append(f)

    # Generate the new folder structure
    new_folder_path = 'raw'
    if exists(new_folder_path):
        shutil.rmtree(new_folder_path)
    dst_path = join(new_folder_path, 'images')
    makedirs(dst_path)

    # copy all the images
    for src_path, src_image in zip(image_path, image_name):
        src = join(src_path, src_image)
        dst = join(dst_path, src_image)
        shutil.copyfile(src, dst)

    print("Job finished")
How would we generate PPM files for custom datasets?
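No method was posted in reply. As a minimal sketch, assuming you already have a class label for every pixel (from manual annotation or another tool), a binary PPM (P6) mask can be written with the standard library alone. The palette below (blue background, green skin, red hair) is an assumption modeled on the LFW part labels; check it against the PPM files you downloaded. `write_ppm_mask` is a hypothetical helper.

```python
# Assumed class-index -> RGB palette; verify against the reference PPM files.
PALETTE = {
    0: (0, 0, 255),  # background
    1: (0, 255, 0),  # skin
    2: (255, 0, 0),  # hair
}


def write_ppm_mask(labels, path, palette=PALETTE):
    """Write a 2D list of class indices as a binary (P6) PPM image."""
    height = len(labels)
    width = len(labels[0]) if height else 0
    if any(len(row) != width for row in labels):
        raise ValueError("all rows must have the same length")

    # Expand each class index into its three RGB bytes, row by row.
    pixels = bytearray()
    for row in labels:
        for label in row:
            pixels.extend(palette[label])

    with open(path, "wb") as out:
        # P6 header: magic number, width and height, maximum channel value.
        out.write(b"P6\n%d %d\n255\n" % (width, height))
        out.write(pixels)
```

For example, `write_ppm_mask([[2, 2, 2], [0, 1, 0]], "0001.ppm")` writes an image 3 pixels wide and 2 pixels high.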
This link is dead: https://drive.google.com/file/d/1TbQ24nIc3GGNWzV_GGX_D-1WpI2KOGii/view
Could you please re-upload it? I am having trouble verifying whether my version of the database works properly.
Using jgoenetxea's script, I've re-uploaded a version:
Convert jpg to ppm
import os
from os import listdir
from os.path import isfile, join, splitext

from PIL import Image


def getAllFileInFolder(folderPath, fileExtension):
    totalExtension = ''
    if fileExtension.startswith('.'):
        totalExtension = fileExtension
    else:
        totalExtension = '.' + fileExtension
    return [f for f in listdir(folderPath) if isfile(join(folderPath, f)) and f.endswith(totalExtension)]


dataset_folder = 'data/masks'
files = getAllFileInFolder(dataset_folder, 'jpg')
for f in files:
    filename = splitext(f)[0]  # file name without the .jpg extension
    print(f)
    print(filename)
    # Close the source image before deleting it (needed on Windows).
    with Image.open(join(dataset_folder, f)) as im:
        # Convert to RGB so the output is a true PPM rather than a PGM.
        im.convert('RGB').save(join(dataset_folder, filename + '.ppm'))
    os.remove(join(dataset_folder, f))
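As a quick sanity check after copying and converting, the sketch below compares the file stems in the image and mask folders and reports anything left unpaired. The folder names `raw/images` and `raw/masks` and the helper `find_unpaired` are assumptions; adjust them to your own layout.

```python
import os


def find_unpaired(image_dir, mask_dir, image_ext=".jpg", mask_ext=".ppm"):
    """Return (images without a mask, masks without an image) as sorted lists of stems."""
    def stems(folder, ext):
        return {
            os.path.splitext(f)[0]
            for f in os.listdir(folder)
            if f.lower().endswith(ext)
        }

    images = stems(image_dir, image_ext)
    masks = stems(mask_dir, mask_ext)
    return sorted(images - masks), sorted(masks - images)


if __name__ == "__main__":
    # Assumed layout; change these paths to match your dataset.
    image_dir, mask_dir = "raw/images", "raw/masks"
    if os.path.isdir(image_dir) and os.path.isdir(mask_dir):
        no_mask, no_image = find_unpaired(image_dir, mask_dir)
        print("images without a mask:", no_mask)
        print("masks without an image:", no_image)
```

Two empty lists mean every image has a mask with the same base name, and vice versa.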
Hi,
I see the following layout in the README:
data/
    raw/
        images/
            0001.jpg
            0002.jpg
        masks/
            0001.ppm
            0002.ppm <<<<
But at http://vis-www.cs.umass.edu/lfw/part_labels/#download, I can only download archives such as lfw-funneled.tgz. Extracting it gives one folder per person, each containing that person's JPG files. Do I need to manually remove the folder structure and rearrange those files as 0001.jpg, 0002.jpg, etc., to match the list of PPM files?
FYI, I got the PPM files from https://drive.google.com/file/d/1TbQ24nIc3GGNWzV_GGX_D-1WpI2KOGii/view (it would be great if you could share the method or code used to generate them).
Regards Gopi. J