Use fake (generated) images for faker as opposed to real people

dmadisetti commented 2 years ago

Describe the bug

It looks like the faker images just use a scraped list of profile pictures from ages back, stored in a Cloudflare bucket: https://cloudflare-ipfs.com/ipfs/Qmd3W5DuhgHirLHGVixi6V76LhCkZUz6pnFt5AJBiyvHye/avatar/<number>.jpg

Instead of using real images (I recognized one of the people lol), faker should probably use fake ones. Some ideas:

FaceGan: https://github.com/barisgecer/facegan
AvatarGan: https://github.com/aakashjhawar/AvatarGAN
Using avatar images that have documented consent

I would suggest just overwriting the Cloudflare images (that way you keep backwards compatibility and are no longer exposing the images). For that, you need to find the maintainer of the bucket, which looks like bogus, so I opened an issue there too. It may also be worth building in some redundancy; I think this community is probably familiar with maverick devs.

Marked as a bug, because you should fix this (even though you are not directly responsible for the images). I don't think it's responsible to post pictures of 1250 random people without their consent.

Reproduction

-

Additional Info

Tracking down the source of the initial images see:

ST-DDT commented 2 years ago

IMO the avatar method should return an actual/fake avatar instead of a person's face. For people's faces we should add a separate API that only returns fake faces.

dmadisetti commented 2 years ago

You could potentially use identicons (which is what GitHub uses), which are generated from a hash of the username: https://github.com/stewartlord/identicon.js

The images could be generated from your list of fake names. I'd count this as a bonus since it could potentially allow for easier debugging (Alice/Bob/Camerons are always going to be associated with the same image).
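A minimal sketch of the "same name, same image" idea, assuming a fixed pool of numbered images. The helper names `fnv1a` and `avatarIndexForName` are hypothetical and not part of faker; the hash is an FNV-1a-style hash over UTF-16 code units, chosen only because it is small and dependency-free.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units: stable across runs and machines.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Multiply by the FNV prime; Math.imul keeps the result in 32 bits.
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // reinterpret as unsigned
}

// Hypothetical helper: map a name onto one of `poolSize` numbered images (1-based).
function avatarIndexForName(name: string, poolSize: number): number {
  if (!Number.isInteger(poolSize) || poolSize < 1) {
    throw new RangeError("poolSize must be a positive integer");
  }
  // Normalize so "Alice" and " alice" resolve to the same image.
  return (fnv1a(name.trim().toLowerCase()) % poolSize) + 1;
}
```

With this, every "Alice" in a test fixture would resolve to the same image number, which is the debugging benefit described above.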

prisis commented 2 years ago

We have some cases where real face images are needed, so an option to choose what type of avatar you want would be best.

dmadisetti commented 2 years ago

Just an update from the guy on the README:

[Image: twitter com_messages_15894801-275195245]

Shinigami92 commented 2 years ago

> We have some cases where real face images are needed, so an option to choose what type of avatar you want would be the best option

Or at least provide an additional API, so we have image.avatar({ type }) and image.face(...) ...

dmadisetti commented 2 years ago

While we're on this issue, I'm going to advocate for user images that span the race / gender spectrum. The current avatars are primarily white and male.

import-brain commented 2 years ago

@Shinigami92 Should this be v6.2 or v7? I'm leaning towards v7, but I'll let you make the final decision

ST-DDT commented 2 years ago

I think we should not tag this for 6.2 as it is a major change in behavior. As for v7, I would like to use v7.0 mainly to do some cleanup and refactoring/renaming so I'm not sure whether this would fit there. We should give this its own feature version to give it its due attention, but I'm not sure when we will tackle this exactly. Maybe we can create a Future milestone for that, which would serve as a "Pull Backlog" for us and "Open for contributions" for others. What do you think?

import-brain commented 2 years ago

> I think we should not tag this for 6.2 as it is a major change in behavior. As for v7, I would like to use v7.0 mainly to do some cleanup and refactoring/renaming so I'm not sure whether this would fit there. We should give this its own feature version to give it its due attention, but I'm not sure when we will tackle this exactly. Maybe we can create a Future milestone for that, which would serve as a "Pull Backlog" for us and "Open for contributions" for others. What do you think?

Yeah, I think the Future milestone is the way to go here. We should make one.

matthewmayer commented 2 months ago

I think a fairly easy way to do this while ensuring a diverse set of images would be:

Generate a list of prompts using something like

faker.helpers.fake(`profile picture of a {{number.int({"min":18, "max":80})}}-year-old {{person.sex}} from {{location.country}}`);

Feed the prompts into an AI image generator (e.g. Stable Diffusion or Adobe Firefly) and generate 4 images at a time at around 512x512
Manually pick the best image from each set
Name these as 1...100.jpg and put them in a GitHub repo under the faker-js org
Update the avatar methods to point at these images via a CDN

This should only be an hour or two's work.
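As a rough illustration of the prompt-generation step, here is a dependency-free sketch that produces a reproducible list of prompts in the same shape as the template above. It does not use faker: the seeded PRNG (`mulberry32`), the `buildPrompts` helper, and the short `COUNTRIES` list are all hypothetical stand-ins, and a real script would presumably draw these values from faker as proposed.

```typescript
// Small seeded PRNG (mulberry32) so the prompt list is reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Hypothetical sample values; the proposal above takes these from faker instead.
const SEXES = ["female", "male"] as const;
const COUNTRIES = ["Brazil", "India", "Japan", "Kenya", "Mexico", "Norway"] as const;

function buildPrompts(count: number, seed: number): string[] {
  const rand = mulberry32(seed);
  const prompts: string[] = [];
  for (let i = 0; i < count; i++) {
    const age = 18 + Math.floor(rand() * (80 - 18 + 1)); // 18..80 inclusive
    const sex = SEXES[i % SEXES.length]; // alternate for an even split
    const country = COUNTRIES[Math.floor(rand() * COUNTRIES.length)];
    prompts.push(`profile picture of a ${age}-year-old ${sex} from ${country}`);
  }
  return prompts;
}
```

Calling `buildPrompts(100, 42)` yields 100 prompts that stay identical between runs, which makes it easier to regenerate or audit a specific image later.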

matthewmayer commented 2 months ago

I tried generating 100 images with Stable Diffusion 3 (50 male, 50 female).

They can be accessed like this: https://cdn.jsdelivr.net/gh/matthewmayer/sd3-avatars/generic/1.jpg up to https://cdn.jsdelivr.net/gh/matthewmayer/sd3-avatars/generic/100.jpg
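A small sketch of how such a URL could be built and validated, assuming the 1 to 100 numbering described here. The function name `sd3AvatarUrl` and the optional `ref` parameter are hypothetical; `ref` relies on jsDelivr's `@ref` suffix for selecting a branch, tag, or commit of a GitHub repo.

```typescript
const AVATAR_COUNT = 100; // images are named 1.jpg through 100.jpg

// Hypothetical helper: build the jsDelivr URL for one generated avatar.
function sd3AvatarUrl(index: number, ref?: string): string {
  if (!Number.isInteger(index) || index < 1 || index > AVATAR_COUNT) {
    throw new RangeError(`index must be an integer from 1 to ${AVATAR_COUNT}`);
  }
  // jsDelivr accepts "user/repo@ref" to pin a branch, tag, or commit.
  const repo = ref ? `matthewmayer/sd3-avatars@${ref}` : "matthewmayer/sd3-avatars";
  return `https://cdn.jsdelivr.net/gh/${repo}/generic/${index}.jpg`;
}
```

Pinning a `ref` is worth considering for anything beyond a prototype, since an unpinned URL follows whatever the repository's default branch currently contains.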

The code I used to generate them is at https://github.com/matthewmayer/sd3-avatars

ST-DDT commented 2 months ago

Looks like a good solution. We have to check the TOS though.

dmadisetti commented 2 months ago

Nice! Great starting point. Didn't exactly hit the mark on diversity though. Maybe 4-5 ethnically ambiguous people in the males + the last 2 that looked intentional (one Black guy and one South Asian)

This might seem overbearing, but from a product perspective, why limit your market?

matthewmayer commented 2 months ago

Agreed, it's too white at the moment.

matthewmayer commented 2 months ago

Made a new branch https://github.com/matthewmayer/sd3-avatars/tree/country-prompt where I append a random "from {{location.country}}" to each prompt, e.g. "profile picture of a 67-year-old woman from India": https://cdn.jsdelivr.net/gh/matthewmayer/sd3-avatars@country-prompt/generic/74.jpg

These folks seem more diverse.

matthewmayer commented 2 months ago

[Image: Screenshot 2024-09-17 at 22 14 29]

dmadisetti commented 2 months ago

Awesome! Good job, Stable Diffusion, for not being offensively stereotypical. LGTM, including the guy that's too cool for a shirt in a profile picture.

ST-DDT commented 2 months ago

If we use jsDelivr, we have to add a link to their TOS to each method returning their links: https://www.jsdelivr.com/terms

matthewmayer commented 2 months ago

Would we want to make this a new method like avatarAI() and then have avatar() pick between avatarAI() and avatarGithub()?
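A minimal sketch of what that split could look like, written as standalone functions rather than faker's real module code. Everything here is an assumption for illustration: the injectable `Rng` parameter, the 50/50 choice, the image URL pattern (taken from the prototype repo linked earlier in this thread), and the GitHub avatar URL shape and ID range.

```typescript
type Rng = () => number; // returns a float in [0, 1)

// Inclusive random integer helper.
function randomInt(rng: Rng, min: number, max: number): number {
  return min + Math.floor(rng() * (max - min + 1));
}

// Assumed: 100 generated images hosted at the prototype repo's jsDelivr path.
function avatarAI(rng: Rng = Math.random): string {
  const index = randomInt(rng, 1, 100);
  return `https://cdn.jsdelivr.net/gh/matthewmayer/sd3-avatars/generic/${index}.jpg`;
}

// Assumed URL shape and ID range for GitHub-hosted avatars.
function avatarGithub(rng: Rng = Math.random): string {
  const id = randomInt(rng, 0, 100_000_000);
  return `https://avatars.githubusercontent.com/u/${id}`;
}

// Picks one of the two sources with equal probability.
function avatar(rng: Rng = Math.random): string {
  return rng() < 0.5 ? avatarAI(rng) : avatarGithub(rng);
}
```

Keeping the two sources as separate methods would let callers who need a face (or specifically do not want one) opt in explicitly, while `avatar()` keeps working as a generic default.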

Shinigami92 commented 2 months ago

> Would we want to make this a new method like avatarAI() and then have avatar() pick between avatarAI() and avatarGithub()?

We could also think about putting this directly where these images are more targeted: the person module. But I'm still a bit unsure whether we should call it avatar, pfp, profilePicture, photo, image, or whatever. Sadly I'm busy the next few days attending a conference, so I can't provide feedback for around 3 days. 🙁 👋

ST-DDT commented 2 months ago

I'm not sure whether I would name it avatar. IMO these are portraits -> faker.image.portrait().

Or should portraits have a blank background? What are your requirements for portraits? Do you just need a portrait, or do you need them in business or "passport" contexts? Or all of these?

We could still add them as a possibility to avatar.

And add a link to the method in the person module's description.

matthewmayer commented 2 months ago

I think avatar is fairly commonly used in software development as a synonym for a typically user-set "profile picture" etc., so I don't mind the current name. At first glance, I'd think faker.image.portrait() would give me a portrait-shaped image (rather than landscape).

matthewmayer commented 2 months ago

We should definitely cross-reference this in the person module overview like we do for email addresses and phone numbers.

Shinigami92 commented 2 months ago

> I'm not sure whether I would name it avatar.
>
> IMO these are portraits -> faker.image.portrait().
>
> Or should portraits have a blank background?
>
> What are your requirements for portraits?
>
> Do you just need a portrait, or do you need them in business or "passport" contexts? Or all of these?
>
> We could still add them as a possibility to avatar.
>
> And add a link to the method in the person module's description.

portraits! That was the word I missed in my head.

I (or rather my previous company) need them like the images generated by @matthewmayer: more or less frontal/orthogonal portraits of human faces, for detecting landmarks and analyzing ethnicity, age and gender. Normal indoor or outdoor real-world backgrounds are welcome, so no ID card backgrounds are required.

The company has its own tools to extract the person from the image and thereby remove the background.

matthewmayer commented 1 month ago

Blocked by #3131. The next step is to set up the assets repo.
