Open Sebastiaan-Alvarez-Rodriguez opened 5 years ago
Hi Sebastiaan! I'm sorry for the late response. Is the issue still relevant? This looks like a bug; I will take a look at it.
As for the "apk already exists" warnings, those are caused by apks with the same package names in the repo, so don't worry about this.
Hi! I just figured: Does az include conflicting selected items only once?
Maybe, during my downloads back then, there was a different number of name conflicts, resulting in a different amount of apks.
Not exactly. After downloading an apk, az tries to save it under the name <package_name>.apk. If that file already exists in the directory, meaning an apk with that package name has already been downloaded, it is saved as <package_name+sha1>.apk instead.
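The naming rule described above can be sketched roughly as follows. This is only an illustration, not az's actual code: the function name `choose_apk_path` and the underscore separating the package name from the sha1 are assumptions, since the thread only gives the pattern <package_name+sha1>.apk.

```python
from pathlib import Path


def choose_apk_path(out_dir: Path, pkg_name: str, sha1: str) -> Path:
    """Pick a file name for a downloaded apk (hypothetical helper).

    Prefer <pkg_name>.apk; if a file with that name is already present,
    fall back to a name that also contains the sha1.
    """
    plain = out_dir / f"{pkg_name}.apk"
    if not plain.exists():
        return plain
    # Assumption: the exact way az joins package name and sha1 is not
    # stated in the thread; an underscore is used here for readability.
    return out_dir / f"{pkg_name}_{sha1}.apk"
```

Under this scheme, two apks that share a package name but differ in sha1 end up as two separate files, which is why the "already exists" warning alone should not change the total count.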
The results can vary if you don't use the same seed argument. But the number of downloaded apks should be the same. And it is weird that you asked for 1000 apks and got 1136.
A bit off point but I am really interested: What happens if a double name collision occurs? Does it get saved as <package_name+sha1+sha1>.apk?
I found it a bit strange myself too. As far as I can recall, I downloaded all apks to a new, empty directory. During downloading, I did not use a seed argument.
Sadly, I already cleaned everything, so there are no logs to study further. Maybe it is better to close this issue for now. Perhaps someone else will experience this behaviour as well and open an issue.
A bit off point but I am really interested: What happens if a double name collision occurs? Does it get saved as <package_name+sha1+sha1>.apk?
This shouldn't be the case, as a sha1 is unique within a repository, as far as I know. If the file with that sha1 already exists, it will be overwritten, which is still OK because the contents would obviously be the same. Such an approach is a bit inefficient, though.
And it is weird that you asked for 1000 apks and got 1136.
Now I see you asked for 10000, not 1000. Then it's OK to get fewer, because there may be only so many apks matching the criteria.
Also keep in mind that, in addition to the apks, you will get the metadata.csv and log.log files. Maybe this somehow played a role. By the way, did you wipe everything in the directory between experiments? I will try to reproduce your experiment a bit later.
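Since a plain file count also includes metadata.csv and log.log, it may help to count files per extension when comparing runs. The following is a minimal sketch; the helper name `count_by_suffix` is made up for illustration and is not part of az.

```python
from collections import Counter
from pathlib import Path


def count_by_suffix(directory: Path) -> Counter:
    """Count regular files in `directory`, grouped by file extension."""
    return Counter(p.suffix for p in directory.iterdir() if p.is_file())
```

Calling `count_by_suffix(Path("."))[".apk"]` in the download directory gives the number of apk files only, so two runs can be compared without the metadata and log files affecting the total.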
I did indeed wipe everything in the directory between experiments as far as I recall
Hello, please help me: how can I create the .az file on Windows 10?
Hello, these two commands were run:
az -n 10000 -d 2018-06-01: -s :3000000 -vt 3:500000 -t 128
az -n 10000 -d 2018-06-01: -s :3000000 -vt 3:500000 -t 2
With ls | wc -l, the first command produced 1138 files, while the second command produced 1136. No environmental changes were made; the only difference is the number of threads. Many "apk with pkg <pkg_name> already exists" warnings were given in both cases. Is your code thread safe? Can race conditions occur?
It would seem that multiple threads get assigned to the same download entry in the dataset.
Please fix this issue
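To illustrate the kind of race asked about here: if several threads check whether <package_name>.apk exists and then write the file, two threads can both see "not there yet" and write to the same name, so one download is silently lost. Whether az actually behaves this way is not confirmed in this thread. The sketch below shows one common way to make such a check-then-write step safe with a lock; `save_apk`, `_save_lock`, and the underscore separator are hypothetical names chosen for the example.

```python
import threading
from pathlib import Path

# One lock shared by all downloader threads (hypothetical).
_save_lock = threading.Lock()


def save_apk(out_dir: Path, pkg_name: str, sha1: str, data: bytes) -> Path:
    """Write an apk so that the existence check and the write are atomic."""
    with _save_lock:
        path = out_dir / f"{pkg_name}.apk"
        if path.exists():
            # Assumption: the fallback name joins package name and sha1.
            path = out_dir / f"{pkg_name}_{sha1}.apk"
        path.write_bytes(data)
    return path
```

With the lock held across both steps, the number of files written no longer depends on how many threads run concurrently, as long as each download entry is handed to exactly one thread.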