mantasu / face-crop-plus

Face aligner and cropper with quality enhancement and attribute parsing
https://mantasu.github.io/face-crop-plus/
MIT License

Disabling rotation and preserving original cropped face size #10

Open godisme1220 opened 4 months ago

godisme1220 commented 4 months ago

Hello,

Thank you for this amazing library! I have a question regarding the face cropping functionality. Currently, the cropped face images are resized to a fixed output size defined by output_size. However, I need the cropped face images to retain their original size without any resizing.

Is there a way to achieve this using the current implementation? If not, what modifications would you suggest to allow the face crops to be saved at their original size?

Also, is there any way to avoid rotating the face?

Thank you for your help!

mantasu commented 4 months ago

Hey there, thanks for pointing this out! There should definitely be an easy way to do this in the future when I restructure the library. For now, I can only give you a "hacky" solution that adjusts the Cropper class. Just make sure to set batch_size=1.

Adjusted Cropper Class (No Resize + No Rotate)

```python
import cv2
import numpy as np
from face_crop_plus import Cropper
from face_crop_plus.utils import (
    STANDARD_LANDMARKS_5,
    get_ldm_slices,
    as_numpy,
    as_tensor,
    read_images,
    as_batch,
)


def estimate_scale_and_translate(src, dst, **kwargs):
    # Create arrays from the lists of tuples
    src = np.array(src)
    dst = np.array(dst)

    # Compute scale as the average ratio of distances between all pairs of points in dst and src
    scales = []
    for i in range(len(src) - 1):
        for j in range(i + 1, len(src)):
            scale = np.linalg.norm(dst[j] - dst[i]) / np.linalg.norm(src[j] - src[i])
            scales.append(scale)

    # 0.92 required to match RANSAC estimation scale when using estimateAffine2D
    scale = np.mean(scales) * 0.92

    # Compute translation as the difference between the scaled src and dst
    translation = np.mean(dst - scale * src, axis=0)

    # Create the transformation matrix
    transform_matrix = np.array(
        [[scale, 0, translation[0]], [0, scale, translation[1]]],
        dtype=np.float32,
    )

    return transform_matrix, None


class CropperNoResize(Cropper):
    def _reinit_landmarks_target(self, output_size):
        std_landmarks = STANDARD_LANDMARKS_5.copy()

        # Apply appropriate scaling based on face factor and out size
        std_landmarks[:, 0] *= output_size[0] * self.face_factor
        std_landmarks[:, 1] *= output_size[1] * self.face_factor

        # Add an offset to standard landmarks to center the cropped face
        std_landmarks[:, 0] += (1 - self.face_factor) * output_size[0] / 2
        std_landmarks[:, 1] += (1 - self.face_factor) * output_size[1] / 2

        # Pass STD landmarks as target landms
        self.landmarks_target = std_landmarks

    def process_batch(self, file_names: list[str], input_dir: str, output_dir: str):
        images, file_names = read_images(file_names, input_dir)

        if self.landmarks is None and self.det_model is None:
            indices, landmarks = list(range(len(file_names))), None
        elif self.landmarks is not None:
            indices, indices_ldm, paddings = [], [], None

            for i, file_name in enumerate(file_names):
                indices_i = np.where(file_name == self.landmarks[1])[0]

                if len(indices_i) == 0:
                    continue

                indices.extend([i] * len(indices_i))
                indices_ldm.extend(indices_i.tolist())

            landmarks = self.landmarks[0][indices_ldm]
        elif self.det_model is not None:
            # Create a batch of images (with faces) and their paddings
            # images, _, paddings = as_batch(images, self.resize_size)
            original_size = (images[0].shape[1], images[0].shape[0])
            images, _, paddings = as_batch(images, original_size)
            images, paddings = as_tensor(images, self.device), paddings

            # If landmarks were not given, predict, undo padding
            landmarks, indices = self.det_model.predict(images)
            landmarks -= paddings[indices][:, None, [2, 0]]

        if landmarks is not None and len(landmarks) == 0:
            return

        if landmarks is not None and landmarks.shape[1] != self.num_std_landmarks:
            slices = get_ldm_slices(self.num_std_landmarks, landmarks.shape[1])
            landmarks = np.stack([landmarks[:, s].mean(1) for s in slices], 1)

        if self.enh_model is not None:
            images = as_tensor(images, self.device)
            images = self.enh_model.predict(images, landmarks, indices)

        images, groups = as_numpy(images), (None, None)

        if landmarks is not None:
            images = self.crop_align(images, paddings, indices, landmarks)

        if self.par_model is not None:
            groups = self.par_model.predict(as_tensor(images, self.device))

        self.save_groups(images, file_names[indices], output_dir, *groups)

    def crop_align(
        self,
        images: np.ndarray | list[np.ndarray],
        padding: np.ndarray | None,
        indices: list[int],
        landmarks_source: np.ndarray,
    ) -> np.ndarray:
        transformed_images = []
        border_mode = getattr(cv2, f"BORDER_{self.padding.upper()}")
        DO_NOT_ROTATE = True

        for landmarks_idx, image_idx in enumerate(indices):
            # Compute the bounding box of the face based on the landmarks
            width_standard = 0.68262291666666670 - 0.31556875000000000
            height_standard = 0.8246919642857142 - 0.4615741071428571
            x_min, y_min = np.min(landmarks_source[landmarks_idx], axis=0)
            x_max, y_max = np.max(landmarks_source[landmarks_idx], axis=0)
            width = int((x_max - x_min) / width_standard / self.face_factor)
            height = int((y_max - y_min) / height_standard / self.face_factor)
            output_size = (width, height)
            self._reinit_landmarks_target(output_size)

            if DO_NOT_ROTATE:
                # This transform does not rotate the cropped face
                transform_function = estimate_scale_and_translate
            elif self.allow_skew:
                transform_function = cv2.estimateAffine2D
            else:
                transform_function = cv2.estimateAffinePartial2D

            transform_matrix = transform_function(
                landmarks_source[landmarks_idx],
                self.landmarks_target,
                ransacReprojThreshold=np.inf,
            )[0]

            if transform_matrix is None:
                continue

            image = images[image_idx]
            transformed_image = cv2.warpAffine(
                image, transform_matrix, output_size, borderMode=border_mode
            )
            transformed_images.append(transformed_image)

        # Normally stacking would be applied unless the list is empty
        # numpy_fn = np.stack if len(transformed_images) > 0 else np.array
        numpy_fn = lambda x: x

        return numpy_fn(transformed_images)
```

So you could just copy the given code into a file like cropper_no_resize.py and then simply use the updated class, e.g., with the demo data from this repository:

from cropper_no_resize import CropperNoResize

# Initialize cropper
cropper = CropperNoResize(
    batch_size=1,
    output_format="jpg",
    face_factor=0.7,
    strategy="largest",
)

# Process images in the input dir
cropper.process_dir(input_dir="demo/input_images")

Sorry about the messy solution, I hope to greatly rework the package sometime in the future. Let me know if anything else pops up and let's keep this open as an enhancement~

Also, face attribute grouping will not work with these modifications (further adjustments would be required), but I assume you won't need it.
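
For anyone wondering why the adjusted class does not rotate the face: the replacement transform only fits a uniform scale and a shift, so the off-diagonal terms of the 2x3 matrix stay at zero. Below is a minimal pure-Python sketch of that idea. The name `fit_scale_and_shift` and the sample points are made up for illustration, and the sketch deliberately leaves out the 0.92 correction used in the code above, so it is not a drop-in replacement.

```python
import math
from itertools import combinations


def fit_scale_and_shift(src, dst):
    """Fit dst ~ scale * src + shift, with no rotation and no skew.

    Hypothetical helper for illustration only; it omits the 0.92
    correction factor used in the adjusted class.
    """
    # Scale: mean ratio of pairwise point distances (dst over src)
    ratios = [
        math.dist(dst[i], dst[j]) / math.dist(src[i], src[j])
        for i, j in combinations(range(len(src)), 2)
    ]
    scale = sum(ratios) / len(ratios)

    # Shift: mean residual left after scaling the source points
    n = len(src)
    shift_x = sum(d[0] - scale * s[0] for s, d in zip(src, dst)) / n
    shift_y = sum(d[1] - scale * s[1] for s, d in zip(src, dst)) / n

    # Zero off-diagonal entries mean the transform cannot rotate the image
    return [[scale, 0.0, shift_x], [0.0, scale, shift_y]]


# Made-up points: dst is src scaled by 2 and shifted by (10, -5)
src = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
dst = [(2.0 * x + 10.0, 2.0 * y - 5.0) for x, y in src]
matrix = fit_scale_and_shift(src, dst)
print(matrix)
```

Because rotation is never estimated, a tilted face stays tilted in the crop; only its size and position change.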

godisme1220 commented 4 months ago

Thanks for your amazing help! It works! I ran into one small issue, though: the output image is not exactly 1:1 (square). How can I fix that?

Regards, Eugene

godisme1220 commented 4 months ago

Oh, I found a dumb way, but it works!

max_side = max(width, height)
width = max_side
height = max_side
output_size = (width, height)

I simply set both the width and the height to whichever of the two is larger, then pass the result to self._reinit_landmarks_target(output_size).
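
To make the arithmetic concrete, here is a small standalone sketch of how the crop size follows from the landmark bounding box, with the square fix as an option. The two constants are the standard-landmark spans copied from the adjusted class; the function name `crop_size_from_landmarks`, the `square` flag, and the sample landmarks are hypothetical and not part of face-crop-plus.

```python
# Spans of the standard 5-point landmarks, as used in the adjusted class
WIDTH_STANDARD = 0.68262291666666670 - 0.31556875000000000
HEIGHT_STANDARD = 0.8246919642857142 - 0.4615741071428571


def crop_size_from_landmarks(landmarks, face_factor, square=False):
    """Return (width, height) of the crop for one face.

    Hypothetical helper: mirrors the size computation in crop_align and,
    if square=True, forces both sides to the larger one.
    """
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]

    # Scale the landmark bounding box up to the full face crop
    width = int((max(xs) - min(xs)) / WIDTH_STANDARD / face_factor)
    height = int((max(ys) - min(ys)) / HEIGHT_STANDARD / face_factor)

    if square:
        # Use the larger side for both dimensions to get a 1:1 crop
        width = height = max(width, height)

    return width, height


# Made-up landmarks (two eyes, nose, two mouth corners) in pixels
landmarks = [(100, 200), (200, 205), (150, 260), (110, 320), (190, 318)]
print(crop_size_from_landmarks(landmarks, face_factor=0.7))
print(crop_size_from_landmarks(landmarks, face_factor=0.7, square=True))
```

Taking the larger side (rather than the smaller) means nothing gets cut off; the extra area is simply filled from the surrounding image or from the border padding.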

mantasu commented 4 months ago

Haha yup, glad it works!