VChristlein / icdar17code

Unsupervised Feature Learning for Writer Identification and Writer Retrieval
GNU General Public License v3.0

Getting issue in implementation #3

Open surbhi1209-code opened 3 years ago

surbhi1209-code commented 3 years ago

Hi, I am having trouble extracting patches from SIFT keypoints and need help understanding the steps required to implement the model.

VChristlein commented 3 years ago

Hi Surbhi,

Can you please be more specific about what you mean?

Given the keypoints and a patch size, in Python it would be something like:

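        all_patches = []
        # split the patch size into two halves (integer division, so odd sizes work)
        ps_h1 = patchsize // 2
        ps_h2 = patchsize - ps_h1
        for kpt in keypoints:
            # kpt.pt is (x, y); truncate to the integer pixel position
            x, y = int(kpt.pt[0]), int(kpt.pt[1])
            # skip keypoints whose patch would cross the image border
            if x - ps_h1 < 0 or y - ps_h1 < 0 \
                    or x + ps_h2 > img.shape[1] \
                    or y + ps_h2 > img.shape[0]:
                continue
            # rows are indexed by y, columns by x
            patch = img[y - ps_h1 : y + ps_h2, x - ps_h1 : x + ps_h2]
            all_patches.append(patch)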

Best regards, V.C.

surbhi1209-code commented 3 years ago

Thanks for your reply. I have extracted the patches from the keypoints, but I do not understand what the target means. After extracting the patches, do I need to cluster them?

Thanks & Regards Surbhi Parmar

VChristlein commented 3 years ago

You need to cluster the SIFT descriptors associated with the keypoints. Then each cluster ID is the target for the respective patch.
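As a minimal sketch of that step, assuming plain k-means (the thread does not name the clustering algorithm) and a toy array standing in for real SIFT descriptors. The names `kmeans_targets`, `n_clusters`, and `n_iter` are made up for illustration, and the dense distance matrix only suits small inputs:

```python
import numpy as np

def kmeans_targets(descriptors, n_clusters, n_iter=20, seed=0):
    """Cluster descriptors with plain k-means; return (centers, cluster ids)."""
    rng = np.random.default_rng(seed)
    descriptors = np.asarray(descriptors, dtype=np.float64)
    # initialise the centers with randomly chosen descriptors
    idx = rng.choice(len(descriptors), size=n_clusters, replace=False)
    centers = descriptors[idx].copy()
    for _ in range(n_iter):
        # squared Euclidean distance of every descriptor to every center
        dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = descriptors[labels == k]
            if len(members) > 0:  # keep the old center if a cluster runs empty
                centers[k] = members.mean(axis=0)
    # final assignment against the updated centers
    dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dists.argmin(axis=1)

# toy stand-in for SIFT descriptors: two well-separated blobs in 128 dimensions
rng = np.random.default_rng(1)
descs = np.vstack([rng.normal(0.0, 0.1, (50, 128)),
                   rng.normal(5.0, 0.1, (50, 128))])
centers, targets = kmeans_targets(descs, n_clusters=2)
```

Here `targets[i]` would be the training target of the patch extracted around keypoint `i`, provided the descriptors and patches are kept in the same order.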

Best regards, V.C.

surbhi1209-code commented 3 years ago

Thanks for the response. The extracted patches will be the input for the ResNet model, am I correct? How many input images should I use to train the model, and what is the output after testing? I would like to know every detail of the implementation. The cfg folder contains two .txt files: what is their role, and how did you label them? If possible, kindly describe each step in detail so that I can better understand the implementation. I need to complete a different but related task, and I am new to this.

Regards Surbhi

VChristlein commented 3 years ago

> Extracted patches will be the input for the ResNet model. Am I correct?

Yes, exactly.

> How many input images should I give for training the model, and what is the output after testing?

I used 500k training patches with the corresponding cluster IDs as targets. The output is a set of features that generalize well: here for the task of writer identification, but in theory also for other tasks.

> In the cfg folder two .txt files are saved. What is the role of these files?

You mean the label files? They give the class (= writer) of each sample in the train and test splits of the ICDAR'17 WI dataset. I created them from the filenames.
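A hedged sketch of how such labels could be derived from filenames. The naming scheme `<writer>-<page>.<ext>`, the `filename label` line layout, and the helper names are all assumptions for illustration; the thread does not specify the actual format of the files in `cfg`:

```python
import os

def writer_from_filename(path):
    """Return the writer id, ASSUMING names of the form '<writer>-<page>.<ext>'."""
    name = os.path.basename(path)
    return name.split("-", 1)[0]

def make_label_lines(paths):
    """Map each distinct writer id to an integer class and emit 'filename label' lines."""
    writers = sorted({writer_from_filename(p) for p in paths})
    class_of = {w: i for i, w in enumerate(writers)}
    return ["{} {}".format(os.path.basename(p), class_of[writer_from_filename(p)])
            for p in paths]

# made-up file names, for illustration only
lines = make_label_lines(["data/12-a.jpg", "data/12-b.jpg", "data/7-a.jpg"])
```

All pages by the same writer end up with the same class index, which is what a writer identification evaluation needs.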

> If possible, kindly describe each step in detail.

Please read the paper and try to understand as much as possible, then ask specific questions.

Best regards, V.C.

surbhi1209-code commented 3 years ago

Thank you for the clarification. Got it

Regards Surbhi

surbhi1209-code commented 3 years ago

Hi, I am having difficulty implementing VLAD encoding. Can you please help me with this? I need code for it.

Regards S.P

surbhi1209-code commented 3 years ago

After training the model, how do I implement VLAD encoding? I have gone through the paper but cannot follow the implementation process. Kindly help.

Thanks and regards Surbhi Parmar

VChristlein commented 3 years ago

VLAD encoding is basically:

for each cluster k:
   sum up all residuals, i.e. the differences between the points assigned to cluster k and the cluster center of k
concatenate all sums of residuals, then normalize
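
The steps above can be sketched in NumPy as follows. This is an illustration under assumptions, not the repository's code: hard nearest-center assignment is used, and the normalization is taken to be a signed square root followed by L2 normalization, which the thread does not spell out. The function name `vlad_encode` is hypothetical.

```python
import numpy as np

def vlad_encode(features, centers):
    """Encode local features (N, D) with K cluster centers (K, D) into a K*D vector."""
    features = np.asarray(features, dtype=np.float64)
    centers = np.asarray(centers, dtype=np.float64)
    # hard-assign every feature to its nearest cluster center
    dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    assignment = dists.argmin(axis=1)
    encoding = np.zeros_like(centers)
    for k in range(len(centers)):
        members = features[assignment == k]
        if len(members) > 0:
            # sum of residuals: assigned features minus the center of cluster k
            encoding[k] = (members - centers[k]).sum(axis=0)
    # concatenate the per-cluster sums into one vector
    encoding = encoding.ravel()
    # normalization (assumed here: signed square root followed by L2)
    encoding = np.sign(encoding) * np.sqrt(np.abs(encoding))
    norm = np.linalg.norm(encoding)
    return encoding / norm if norm > 0 else encoding

centers = np.array([[0.0, 0.0], [10.0, 10.0]])
features = np.array([[1.0, 0.0], [0.0, 2.0], [9.0, 10.0]])
v = vlad_encode(features, centers)
```

In this toy case the first two features fall into cluster 0 and the third into cluster 1, so the summed residuals are `[1, 2]` and `[-1, 0]` before normalization. In the pipeline discussed here, `features` would be the network activations of all patches of one document, and `centers` would come from clustering such activations on training data.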

surbhi1209-code commented 3 years ago

Thanks, I'll check it out.

surbhi1209-code commented 3 years ago

I want to clarify one thing: the VLAD encoding step is applied to the features extracted from the trained ResNet. Is that correct?

Thanks & Regards S.P

VChristlein commented 3 years ago

Yes, VLAD encoding comes on top of ResNet activation features.

surbhi1209-code commented 3 years ago

Thanks a lot for all the help!

surbhi1209-code commented 3 years ago

Thanks, I'll check it out.
