google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

Javascript Solutions Source Code #1408

Closed Choons closed 1 year ago

Choons commented 3 years ago

I'm happy to see that a JavaScript solution has been released for FaceMesh, but I can't seem to find the uncompiled JavaScript or TypeScript source code here on GitHub. Am I just missing it, or is it not posted?

mgyong commented 3 years ago

We don't plan to release the source code for MP JS Solutions API. Python source is out but not JS for now

Choons commented 3 years ago

?? if I'm not mistaken the python code is available? Anyway, without the code, then we need more documentation on the API than the little example reveals. I can't figure out the range of capabilities from that alone.

djthegr8 commented 3 years ago

> We don't plan to release the source code for MP JS Solutions API. Python source is out but not JS for now

@mgyong why would that be though? The files can certainly be improved by open sourcing and anyways, people can access the minified code, so no scope for proprietary right?

djthegr8 commented 3 years ago

And regarding documentation, please review my PR #1434 which adds helpful notes about API and Utilities

afogel commented 3 years ago

any update on this? @mgyong (might have gotten lost b/c most people were probably out of the office around christmas/new years)

afogel commented 3 years ago

@mgyong @tyrmullen sorry to ping, just curious whether there's any new information about getting better documentation/more information about why MP doesn't plan on releasing the source for the JS Solutions API?

BSchrift commented 3 years ago

Even if the source code has not been released, has anyone documented the API or created typescript declarations?

mgyong commented 3 years ago

The problem is that we require bazel to support the --wasm option, plus additional work on our end to clean up our internal code before we can release. Currently there are no plans until maybe Q2 2021.

Choons commented 3 years ago

it's a bit silly really considering the code is already out there in the tensorflow.js models examples. The question is how similar or different your release is from that?

mgyong commented 3 years ago

@Choons It's quite different. tf.js does not use wasm. The MP JS Solutions API does. If it were easy for us to release, we would have done it. It's a matter of bandwidth and priority.

Choons commented 3 years ago

true. it doesn't expose the killer features promised in the write-up. quite different.

chuoling commented 3 years ago

One thing we could do quickly is to include the TypeScript declarations (w/ inline comments) in the NPM package. @Choons would that help?

Choons commented 3 years ago

Yes! Anything you can pass over to us would be great. We're willing to help!

chuoling commented 3 years ago

Great. @mhays-google owns our JS solutions and will look into it.

Choons commented 3 years ago

I'll leave this here. https://github.com/spite/FaceMeshFaceGeometry

maybe people will find it useful

mhays-google commented 3 years ago

I am uploading our TypeScript exports file as part of the npm package for each and every solution. For any solution, the easiest way (I've found) to get to this file will be:

https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/index.d.ts
https://cdn.jsdelivr.net/npm/@mediapipe/control_utils/index.d.ts
https://cdn.jsdelivr.net/npm/@mediapipe/drawing_utils/index.d.ts
https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh/index.d.ts
https://cdn.jsdelivr.net/npm/@mediapipe/face_detection/index.d.ts
https://cdn.jsdelivr.net/npm/@mediapipe/pose/index.d.ts
https://cdn.jsdelivr.net/npm/@mediapipe/hands/index.d.ts
https://cdn.jsdelivr.net/npm/@mediapipe/holistic/index.d.ts

...et al.

Even if you are not familiar with TypeScript, this should be pretty decipherable.

While this is not full documentation, per se, it should give you visibility into what can be called within the API. This is a quick fix to address what is obviously a very opaque interface. I will add a README.md to the packages giving these same directions shortly, but I wanted to have this to you by end of day. I'll look into better documentation in general.

mhays-google commented 3 years ago

Regarding the full release of the source code, this is being discussed, I personally support it, but there are a few hurdles.

Hopefully releasing the index file solves a few immediate issues -- thank you for the patience with that, by the way. And we'll look into what it will take to release more of the code (or at least the unobfuscated javascript).

Choons commented 3 years ago

Much appreciated, Michael! Yeah I love web assembly as a concept, but without documentation of what the wasm modules expose, it's just shooting in the dark trying to guess what's in the API. Also love the concept of using emscripten/llvm on c++ to create javascript/wasm, but in practice I have found the reality is often more difficult to create than just porting the code manually to javascript or Typescript, or Assemblyscript and then to wasm for the performance gains. Maybe the community here can contribute in those areas of the API that are lacking due to the work load of the developers.

afogel commented 3 years ago

THIS IS AWESOME!! Thanks @mhays-google

BSchrift commented 3 years ago

Thanks @mhays-google! This is so helpful.

I do have one more quick question: is there a reason why the package.json for the npm package doesn't specify a "main" or "typings" field?

It would be great to be able to say, for example, import { Hands } from '@mediapipe/hands' and have typescript know where to get the js code and the associated type declarations. Looking at the distributed package.json I think it would just be a matter of setting:

"main": "hands.js", 
"typings": "index.d.ts",

or the equivalent in each package, unless I'm misunderstanding how the packages are structured.

lostfictions commented 3 years ago

It's really unfortunate that there's no better documentation available, especially for the helper libraries (@mediapipe/drawing_utils, @mediapipe/camera_utils, @mediapipe/control_utils). Without them, it's remarkably hard to figure out how one would go about implementing drawing code -- for example, drawConnectors is totally opaque since it's shipped as a Closure-minified module, and there's no explanation or documentation elsewhere for how landmark indices are meant to be interpreted. As it stands, MediaPipe JS seems to either be a toy to show that a browser implementation is theoretically possible, or... a solution non-googlers are supposed to reverse-engineer?

afogel commented 3 years ago

@lostfictions unfortunately, it seems like the team is juggling a ton in active development, which is part of the reason that the documentation is sparse. Those of us who are trying to use it are reverse engineering some elements of it from the minified code, but with a bit of effort, it's eminently doable. That said, I'd recommend you peruse the issues, as the team has released some documentation (see https://github.com/google/mediapipe/issues/1408#issuecomment-810652766) and @djthegr8 has done a great job documenting the utils in another PR (thanks again, y'all!!)

luizjr commented 2 years ago

> It's really unfortunate that there's no better documentation available, especially for the helper libraries (@mediapipe/drawing_utils, @mediapipe/camera_utils, @mediapipe/control_utils). Without them, it's remarkably hard to figure out how one would go about implementing drawing code -- for example, drawConnectors is totally opaque since it's shipped as a Closure-minified module, and there's no explanation or documentation elsewhere for how landmark indices are meant to be interpreted. As it stands, MediaPipe JS seems to either be a toy to show that a browser implementation is theoretically possible, or... a solution non-googlers are supposed to reverse-engineer?

The lack of documentation is very bad. I myself am struggling to use camera_utils: because it is minified, it is difficult to read and understand what is happening, and it only picks up the default webcam.

I would like to pass another camera.

Also, I haven't found a suitable way anywhere to use MediaPipe Pose with an image or video source other than a camera.

It's making me give up using mediapipe in javascript.

tyrmullen commented 2 years ago

I think this is a common source of confusion (and issues). Most of the *_utils mini-libraries are purely helpers for allowing us to quickly make CodePen demos; we do not plan on supporting them as real APIs, and they really shouldn't be relied on for anything more complicated than toy/demo cases.

However, the JS Solutions themselves should be extremely flexible, and while actual documentation is currently very lacking, we hope that browsing the definition files mentioned by @afogel should be sufficient for answering most developer questions as to usage. That's where all of the API calls can be found (along with descriptive comments as to intended usage).

For example, @lostfictions, if you take a look at https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh/index.d.ts, you'll see that we export groups of landmark connections, like FACEMESH_RIGHT_EYE. So logging these will tell you both (a) which landmarks correspond to the right eye of a FaceMesh result, and (b) which of those landmarks should be connected in order to draw the right eye for visualization purposes. The drawing_utils code just draws things (lines or circles usually) using these "groups" of indices to index into the actual landmark locations. The keypoints in @mediapipe/pose/index.d.ts should be even more straightforward.
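As a rough illustration of that indexing step, the sketch below resolves a list of index pairs into line segments that could then be drawn on a canvas. It is not the drawing_utils implementation (that source is not published). The `Landmark` and `Connection` shapes and the function name `connectionsToSegments` are assumptions made for this example, so check them against the index.d.ts files before relying on them.

```typescript
// Assumed shapes for illustration only; the real typings live in index.d.ts.
interface Landmark {
  x: number;
  y: number;
}

// One connection is a pair of indices into the landmark array.
type Connection = [number, number];

// Resolve each index pair into the two landmarks it connects.
// Pairs that point outside the landmark array are skipped.
function connectionsToSegments(
  landmarks: readonly Landmark[],
  connections: readonly Connection[],
): Array<[Landmark, Landmark]> {
  const segments: Array<[Landmark, Landmark]> = [];
  for (let i = 0; i < connections.length; i++) {
    const pair = connections[i];
    if (!pair) continue;
    const from = landmarks[pair[0]];
    const to = landmarks[pair[1]];
    if (from && to) {
      segments.push([from, to]);
    }
  }
  return segments;
}
```

Each returned segment is a start and end point, so drawing a group such as the right eye comes down to one line per segment.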

And to answer your question, @luizjr: the actual method for sending input into the Solutions (.send()) is quite flexible. If you look at the pose defines, you can see that the input image type is defined as export type InputImage = HTMLVideoElement | HTMLImageElement | HTMLCanvasElement;. So whatever camera code you set up on your end (e.g. via getUserMedia), as long as you grab the corresponding HTMLVideoElement (or render it first to an HTMLCanvasElement), MediaPipe will process it just fine. In fact, although it's not listed, you should even be able to use ImageBitmap as input if you really needed to (this was discussed in another issue).
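For the "pass another camera" case, one possible approach is to skip camera_utils and build the getUserMedia constraints yourself. The sketch below only covers picking a device: the helper name `constraintsForCamera`, the label-matching strategy, and the `DeviceInfo` type (a structural subset of the browser's MediaDeviceInfo) are assumptions for illustration, not part of any MediaPipe package.

```typescript
// Structural subset of the browser's MediaDeviceInfo, declared here so the
// sketch also type-checks outside a browser.
interface DeviceInfo {
  kind: string;
  deviceId: string;
  label: string;
}

// Shape accepted by navigator.mediaDevices.getUserMedia().
interface CameraConstraints {
  video: true | { deviceId: { exact: string } };
}

// Pick the first video input whose label contains `labelPart`
// (case-insensitive). Falls back to the default camera if none matches.
function constraintsForCamera(
  devices: readonly DeviceInfo[],
  labelPart: string,
): CameraConstraints {
  const wanted = labelPart.toLowerCase();
  for (let i = 0; i < devices.length; i++) {
    const device = devices[i];
    if (!device || device.kind !== "videoinput") continue;
    if (device.label.toLowerCase().indexOf(wanted) !== -1) {
      return { video: { deviceId: { exact: device.deviceId } } };
    }
  }
  return { video: true };
}

// Browser usage (not executed here). Device labels are usually empty until
// the user has granted camera permission.
//   const devices = await navigator.mediaDevices.enumerateDevices();
//   const stream = await navigator.mediaDevices.getUserMedia(
//     constraintsForCamera(devices, "usb"));
//   videoElement.srcObject = stream;
//   await videoElement.play();
//   // videoElement can then be passed as the image to .send()
```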

luizjr commented 2 years ago

> And to answer your question, @luizjr: The actual method for sending input into the Solutions (.send()) is quite flexible. If you look at the pose defines, you can see that the input image type is actually defined as export type InputImage = HTMLVideoElement | HTMLImageElement | HTMLCanvasElement;, so whatever camera code you decide to set up on your end (i.e. via getUserMedia), as long as you grab a corresponding HTMLVideoElement (or decide to render it first to an HTMLCanvasElement), then MediaPipe will process it just fine. In fact, although it's not listed, if you really wanted to, you should even be able to use ImageBitmap as input if you really needed to (this was discussed in another issue).

@tyrmullen your explanation helped me a little, but I still have another question.

With camera_utils I have the onFrame() property, where I call pose.send().

If I call pose.send() without onFrame(), it only renders once. How do I send the frames properly?

tyrmullen commented 2 years ago

You'll have to call pose.send() for every frame in your video that you want to render. Browser calls like requestAnimationFrame() are probably the easiest way to repeatedly poll for video frames.
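A minimal sketch of such a per-frame loop is below. The `FrameSink` interface is a hypothetical stand-in for a solution object whose send() takes an `{ image }` input, as in the published demos; confirm the exact input shape in the relevant index.d.ts. The scheduler is passed in as a parameter so the same loop works with requestAnimationFrame in a browser or with a timer elsewhere. The names `startFrameLoop`, `Scheduler`, and `shouldContinue` are made up for this example.

```typescript
// Hypothetical minimal shape of a solution object; see index.d.ts for the
// real typings.
interface FrameSink<T> {
  send(inputs: { image: T }): Promise<void>;
}

// Anything that runs a callback "later", e.g. requestAnimationFrame.
type Scheduler = (callback: () => void) => void;

// Sends `source` to `sink` once per scheduled tick until `shouldContinue`
// returns false. Each send is awaited before the next tick is scheduled,
// so frames never pile up while the previous one is still processing.
function startFrameLoop<T>(
  sink: FrameSink<T>,
  source: T,
  schedule: Scheduler,
  shouldContinue: () => boolean,
): void {
  const tick = async (): Promise<void> => {
    if (!shouldContinue()) return;
    await sink.send({ image: source });
    schedule(() => {
      void tick();
    });
  };
  schedule(() => {
    void tick();
  });
}

// Browser usage (not executed here), assuming `pose` and `videoElement` exist:
//   startFrameLoop(pose, videoElement,
//     (cb) => requestAnimationFrame(cb),
//     () => !videoElement.paused && !videoElement.ended);
```

Awaiting send() before scheduling the next tick is a design choice: it trades a little latency for never having two frames in flight at once.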

ahsenkh commented 2 years ago

@mhays-google with regards to this comment, can you please also link the type files in the npmjs pages?

I had a hard time finding types for the face_detection solution until I stumbled across this GitHub issue.

liufsd commented 2 years ago

+1

ayoub-root commented 2 years ago

Any updates?

alexandernst commented 2 years ago

Almost hitting the 2 years mark and still no source code. Google is totally misleading everybody when stating that this is open source. Either release the code or clearly state that it's not open source.

wheelie33 commented 1 year ago

> We don't plan to release the source code for MP JS Solutions API. Python source is out but not JS for now

Why?

wheelie33 commented 1 year ago

> ...you should even be able to use ImageBitmap as input if you really needed to (this was discussed in another issue).

Can you provide a working example of the HTMLImageElement? I'm trying a VERY VERY small prototype and it has some unhandled exceptions.

esinanturan commented 1 year ago

> Regarding the full release of the source code, this is being discussed, I personally support it, but there are a few hurdles.
>
> Hopefully releasing the index file solves a few immediate issues -- thank you for the patience with that, by the way. And we'll look into what it will take to release more of the code (or at least the unobfuscated javascript).

@mhays-google Any news regarding the release?

kuaashish commented 1 year ago

@Choons,

JS source is available for MP Tasks, and will not be published for the other Solutions.

afogel commented 1 year ago

Hi @kuaashish, thanks for the update! Based on my read through the Tasks documentation, does that mean that we would effectively create our own holistic model using a Tasks pipeline that we define for ourselves?

How much less efficient will it be to use the pose, face, and hands tasks in conjunction, as compared to using the holistic model? Thanks :)

Choons commented 1 year ago

it's pretty much useless. I moved on to other code a long time ago.

alexandernst commented 1 year ago

So much for the "open source" part of this project

kuaashish commented 1 year ago

@Choons,

Thanks for the confirmation. If this is no longer an issue on your end, can we go ahead and close this?

github-actions[bot] commented 1 year ago

This issue has been marked stale because it has no recent activity since 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 1 year ago

This issue was closed due to lack of activity after being marked stale for past 7 days.