pmh9960 / iColoriT

Official PyTorch implementation of "iColoriT: Towards Propagating Local Hint to the Right Region in Interactive Colorization by Leveraging Vision Transformer." (WACV 2023)
MIT License

the difference between DataTransformationFixedHint function and DataTransformationFixedHintContinuousCoords function #9

Open HONGJINLYU opened 1 year ago

HONGJINLYU commented 1 year ago

During the validation, inference, and testing phases, we can apply the DataTransformationFixedHint function to specify fixed coordinates. However, I've noticed another function named DataTransformationFixedHintContinuousCoords. Judging from its name, it appears to be designed for continuous coordinates.

As the released code uses the RandomHintGenerator function during the training phase to generate random hints, it's reasonable to assume that the same trained model should be capable of handling both sparse and continuous hints at the same time. If my understanding is incorrect, I would greatly appreciate your clarification.

Then, I have a couple of questions:

(1) Could you kindly explain the distinction between DataTransformationFixedHint and DataTransformationFixedHintContinuousCoords? I'm curious about the need for a specific function for continuous hints.

(2) The primary difference between the two functions seems to be one additional line of code in the __call__ method: hint_coords = [hint_coords[0][:idx] for idx in range(len(hint_coords[0]) + 1)]. As a result, the coordinates text file might need a different format from the one used by DataTransformationFixedHint. Could you clarify the specific format that the DataTransformationFixedHintContinuousCoords function requires? An illustrative example would be immensely helpful.

(3) Will the trained model based on randomly generated hints perform differently on sparse hints versus continuous hints?
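For readers following question (2), here is a minimal sketch of what the quoted list comprehension produces, written in plain Python. It assumes that hint_coords is a list whose first element is a list of (row, col) pairs. That layout, the sample coordinates, and the helper name expand_to_prefixes are illustrative assumptions and are not taken from the repository.

```python
def expand_to_prefixes(hint_coords):
    """Apply the line quoted in the issue: return every prefix of hint_coords[0]."""
    return [hint_coords[0][:idx] for idx in range(len(hint_coords[0]) + 1)]


# Hypothetical input: one image with three hint locations given as (row, col).
hint_coords = [[(10, 20), (30, 40), (50, 60)]]

prefixes = expand_to_prefixes(hint_coords)

# One entry per hint count, from zero hints up to all hints.
for prefix in prefixes:
    print(len(prefix), prefix)
```

Under these assumptions the result has len(hint_coords[0]) + 1 entries: the first is empty, and each later entry extends the previous one by exactly one coordinate, so the hints accumulate in the order in which they were listed.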

I genuinely appreciate your assistance and insights. Thank you in advance for your kind response!

Best regards, HONGJIN

pmh9960 commented 1 year ago

Hi, thank you for your interest in our work, and we apologize for the confusion.

As you pointed out regarding the distinction, DataTransformationFixedHintContinuousCoords was originally intended for evaluating point-interactive performance in a "sequential" setting within the same image (a clearer name might be DataTransformationFixedHintSequentialCoords). In our paper, the quantitative results for point-interactive colorization were measured with randomly selected points, at varying point counts on randomly selected images (x-axis in the figure). However, for a more faithful assessment of point-"interactive" performance, introducing points sequentially might be a more realistic setup. We considered evaluating our model with sequentially introduced points; however, the differences we observed were not significant.
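The two evaluation setups mentioned above can be contrasted with a small sketch in plain Python. This is only one possible reading of the description, not the repository's evaluation code: the function names, the 8x8 grid of candidate locations, and the list of hint counts are all hypothetical.

```python
import random


def independent_hints(candidates, counts, seed=0):
    """Draw a fresh random set of points for every hint count."""
    rng = random.Random(seed)
    return {n: rng.sample(candidates, n) for n in counts}


def sequential_hints(candidates, counts, seed=0):
    """Fix one random order of points, then reveal it one prefix at a time."""
    rng = random.Random(seed)
    order = rng.sample(candidates, max(counts))
    return {n: order[:n] for n in counts}


# Hypothetical candidate hint locations on an 8x8 grid, as (row, col) pairs.
candidates = [(r, c) for r in range(8) for c in range(8)]
counts = [1, 2, 5, 10]

ind = independent_hints(candidates, counts)
seq = sequential_hints(candidates, counts)

# Sequential sets are nested: the 2-hint set is the start of the 5-hint set.
print(seq[2] == seq[5][:2])
```

In the independent setup the points used at one hint count need not appear at the next, whereas in the sequential setup each larger set contains every smaller one, which mirrors a user adding hints one after another to the same image.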

In my view, evaluating under more lifelike scenarios (with DataTransformationFixedHintContinuousCoords) could be a promising avenue for research in point-interactive tasks.

Best regards, Park