INF800 closed this issue 2 years ago
Hmm, found this link from somewhere https://mega.nz/folder/wCpSzSoS#RXzIlrv--TDt3ENZdKN8JA
Hi @INF800 hope this is enough information to start:
I am stuck at the first step currently. (Because I am also looking at the training code which I will use in future)
These two files are actually downloadable
- The new download source of RSICD-MEGA
dataset.json (2.8MB) imgs.rar (252.9MB)
- The new download source of Sydney-captions and UCM-captions-MEGA.
annotations_rsicd.rar (836KB) RSICD_images.rar (459.8MB)
Also, can I know what we are using for search in the HF Spaces demo? The speed is impressive.
We didn't use the MEGA data; these appear to have been added after we were done with the project. I'm not sure if one is a subset of the other. If you are going to retrain with more data, it might make sense to download both archives and check.
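One way to run that check, as a rough sketch: extract both archives to local folders and compare files by content hash rather than by name, since the same image could be stored under different names. The function names and the `only_in_a` / `only_in_b` keys are made up for illustration, and the sketch assumes the `.rar` files have already been unpacked with an external tool.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map SHA-256 digest -> relative path for every file under root.

    Files with identical bytes collapse to a single entry.
    """
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            digests[digest] = str(path.relative_to(root))
    return digests


def compare_archives(dir_a, dir_b):
    """Report how the contents of two extracted archives overlap."""
    a, b = file_digests(dir_a), file_digests(dir_b)
    shared = a.keys() & b.keys()
    return {
        "only_in_a": sorted(a[d] for d in a.keys() - shared),
        "only_in_b": sorted(b[d] for d in b.keys() - shared),
        "shared": len(shared),
        "a_subset_of_b": a.keys() <= b.keys(),
        "b_subset_of_a": b.keys() <= a.keys(),
    }
```

Calling `compare_archives("extracted_a", "extracted_b")` (placeholder folder names) returns the overlap summary. Byte-identical matching will miss images that were re-encoded or resized, so an empty overlap here does not prove the datasets are disjoint.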
And as I mentioned earlier, we are using NMSLib for searching in the HF-spaces demo.
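For intuition about what the search step does, here is a minimal sketch. It is not NMSLib and not the demo's code: it is an exact brute-force cosine-similarity search in plain Python, which is the computation an approximate index like NMSLib speeds up on large collections. The function names and the toy 2-d vectors are hypothetical; real embeddings from the model would be much higher dimensional.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query, image_vectors, k=3):
    """Return the k (image_id, score) pairs most similar to the query.

    image_vectors is a dict of image_id -> embedding (hypothetical layout).
    Every vector is scored, so cost grows linearly with collection size.
    """
    q = normalize(query)
    scored = []
    for image_id, vec in image_vectors.items():
        v = normalize(vec)
        score = sum(a * b for a, b in zip(q, v))
        scored.append((score, image_id))
    # Highest score first; break ties by id for a stable order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(image_id, score) for score, image_id in scored[:k]]


# Toy data standing in for image embeddings.
toy_index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k([1.0, 0.1], toy_index, k=2)
```

For a few thousand images this exact scan may already be fast enough; an approximate index becomes worthwhile as the collection grows.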
@sujitpal for the data used, does the RSICD data also contain Sydney and UCM-Merced data? If not, did you use all three in your training or only the RSICD data?
For the training I believe we used RSICD plus the Sydney and UCM-Merced datasets, but I will defer to @arampacha for the authoritative answer.
IIRC the data is not overlapping and we used all 3 datasets. This is also consistent with the model cards we have, and I don't see a reason not to trust those ;)
Hi, I am planning to build a (scalable) vector search engine using this model. I need to understand a couple of things for it. Would you be able to help me?
These are the current challenges I face:
gdrive/MyDrive/RSICD/RSICD_images.rar. The Drive link in RSICD_optimal is prompting "request access".