[BUG] there is some typo error

sumanttyagi commented 3 months ago

[ ] 1 on hugging face everywhere you have written v2 dataset , as per the git readme the model is trained on v2a ?https://github.com/LADI-Dataset/ladi-overview/issues even in github but the classes of v2a is mentioned as per technical write up .

Please give us clarity - or update the documentation

[ ] 2 - Also did you used the resized version (i.e. resized version was used for training ) for publishing the model or just the v2a dataset for training model ?
[ ] 3 If the raw images were of varied sizes, how did you choose the final dimensions? Did you use any sampling methods? If so, to what scale? I noticed that the input size of the images is 448x448. for LADI-v2-classifier-small
[ ] 4 - reference model reference model , Could you please provide more details on how the results of this paper differ from those of other models? Specifically, how did using 80% of the images impact the outcomes? Why did you choose not to train the model with the entire dataset? Additionally, can you provide any use cases for this approach publishing the reference model ? (i.e please share more details how paper results is different from other model ? using 80 % images how did it helped why did not just train model with all dataset why reference any use case to it ?)

sscheele commented 3 months ago

Hi Sumant - question 1 is answered in the README. Since that's something like the fourth question you've asked that's answered in the README, I'd like to gently ask you to make sure you've read it before asking any more questions. If you're not a native English speaker and find that difficult, you might ask an LLM to read it for you and answer your questions.

As you point out, further resizing the images is done internally in the code, so it shouldn't really matter whether we used the resized versions or not, but we did use the resized versions for most of our runs. Input sizes were determined by the base model.

We did train the models which are not labeled as 'reference' on the entire dataset. The other questions you're asking in your point 4 relate to the basics of why we use train and test splits - I would encourage you to find the answers by taking an ML course or asking an LLM.

I really do appreciate your interest in our work, but to save myself some time I will close any further issues you open with questions you could answer with an LLM.

sumanttyagi commented 3 months ago

@sscheele "Please perceive better don't overfit like AI models on my questions" regarding 3 did you resize the image to1800*1200 again resized to the base model which 448448 - am i right ? It's really concerning that changes are happening, especially since the Civil Air Patrol mostly uses DJI. Additionally, to ensure that ortho images work effectively over a larger area, what aspects should be preserved? as you being expert in vision should have known the alogrithims used for resized in your internal code that's play crucial role when you are folding the images 10x times from 4096 4096 to 448size 448

sumanttyagi commented 3 months ago

for 4 th in "ML course which i took ,i was never taught to train a model without split" - we must have ground truth train test val . You just trained a model on all data then tested on random unseen data . Published that model . You must a checked the result by yourself have you send those results to 43 people who annotated to verify results will it not add bias. Your whole research write is based on this model for performance where you skipped testing it on annotated data really you are claiming to use that model even highlight that in paper for comparisons are you in capacity like the expert 43 annotated people to understand every disaster ?

You know what bias it add up without asking to any LLM let me give you golden words -"Population mean and deviation of images must be same to sample training mean and deviation are required for robust model"

sumanttyagi commented 3 months ago

for 1 you knew it's typo you have given specific names to v2a_resized you just mentioned in readme that we do ---- why not carry this literature everywhere

sumanttyagi commented 3 months ago

Hi Sumant - question 1 is answered in the README. Since that's something like the fourth question you've asked that's answered in the README, I'd like to gently ask you to make sure you've read it before asking any more questions. If you're not a native English speaker and find that difficult, you might ask an LLM to read it for you and answer your questions.

As you point out, further resizing the images is done internally in the code, so it shouldn't really matter whether we used the resized versions or not, but we did use the resized versions for most of our runs. Input sizes were determined by the base model.

We did train the models which are not labeled as 'reference' on the entire dataset. The other questions you're asking in your point 4 relate to the basics of why we use train and test splits - I would encourage you to find the answers by taking an ML course or asking an LLM.

I really do appreciate your interest in our work, but to save myself some time I will close any further issues you open with questions you could answer with an LLM.

3 is the most critical one if you want this model to be used globally on other dataset people might be interested what input size image should be given at what resolution and aspect so that then need not to label and finetune the model . It will increase the scope make it more efficient in terms of usage

LADI-Dataset / ladi-overview

[BUG] there is some typo error #15