google-coral / libedgetpu

Source code for the userspace-level runtime driver for Coral.ai devices.
Apache License 2.0
179 stars · 60 forks

Update libedgetpu repo for compatibility of recent versions of Tensorflow. #60

Closed feranick closed 5 months ago

feranick commented 6 months ago

This PR updates the libedgetpu repo for compatibility with recent versions of TensorFlow.

It is in relation to issues https://github.com/tensorflow/tensorflow/issues/62371 and https://github.com/google-coral/libedgetpu/issues/53.

Skillnoob commented 6 months ago

@dmitriykovalev can you or some other maintainer please review this PR? Merging it would fix many of the issues that users of the Coral Edge TPU have with newer TensorFlow and Python versions.

cocoa-xu commented 6 months ago

@dmitriykovalev can you or some other maintainer please review this PR? Merging it would fix many of the issues that users of the Coral Edge TPU have with newer TensorFlow and Python versions.

This repo seems to have gone unmaintained for a long time, so I had to create my own fork and maintain the code there.

In that repo, I've also fixed support for armv6 devices and added support for riscv64 devices, for which I can create another PR. This should make it much easier for many single-board computers, like some older RPis, the MangoPi, and some StarFive boards, to get access to libedgetpu.

stakach commented 6 months ago

This is awesome. @dmitriykovalev @PeterMalkin, maybe allow the community to take over maintenance if Google is no longer interested?

Skillnoob commented 6 months ago

This is awesome. @dmitriykovalev @PeterMalkin, maybe allow the community to take over maintenance if Google is no longer interested?

I fully agree. They should find someone who is interested in maintaining the project and can be trusted not to do anything malicious, perhaps someone else from Google. One maintainer should be enough to keep the lib sufficiently up to date that it doesn't break again.

feranick commented 6 months ago

Copying @pkgoogle, hoping this can move forward.

pkgoogle commented 5 months ago

I don't have access to this repo, but I'll see if I can help bring some attention to this.

feranick commented 5 months ago

Thanks, @pkgoogle, much appreciated.

jakew009 commented 5 months ago

I don't have access to this repo, but I'll see if I can help bring some attention to this.

I think everyone using this project would be hugely grateful if you could.

There are lots of people using Corals in commercial projects, with the chips soldered onto PCBs, who really are being left in the lurch :(

pkgoogle commented 5 months ago

Hi @jakew009, Thanks for the information -- if you have any data on how many people this is affecting (besides what I can gather from this thread) I can push harder on my end.

feranick commented 5 months ago

Hi @pkgoogle, this is just to give you a bit more context. To function, the Edge TPU needs the libedgetpu library (and some functionality is only available with the optional pycoral/libcoral libraries). The libedgetpu library was last updated years ago, and it only supports old versions of TF (2.5) and Python. As most platforms (OS, Python 3, etc.) have moved on, getting machines to use libedgetpu is basically impossible, and with it the actual use of the Edge TPU itself. For example, Raspberry Pi 4/5 ships with Python 3.11, yet there is no support for it.

One may say: "well, the repo is there, compile your own libraries". Unfortunately, that won't work, as the repo needs updates to make it viable and in sync with current TF, Python, and OS releases. This is what my PR provides.

Furthermore: one could understand that if the Coral Edge TPU were discontinued, the lack of support would be frustrating (discontinuing sales should not mean discontinuing support) but a fact of life. And yet, not only is the Edge TPU not discontinued, it is still sold through several channels (including the official coral.ai). In fact, I have an email exchange with the sales team for the Edge TPU platform (which is operated by ASUS) confirming that the Edge TPU is absolutely not discontinued and, according to them, fully supported by the Google team. This is of course not true, as I gather, but it shows a discontinuity in how the platform is managed. In essence, this is a TPU you can buy (even through professional bulk sales) but may not be able to run anywhere.

Besides, in an age where every company is launching its NPUs with grand fanfare, it's sad to see a pioneering gadget left with no support. People may see it as a great opportunity to get one, only to be left with a useless device.

I hope this helps. Thanks very much, your push is much appreciated.

Nicola

P.S. I always thought the Edge TPU would have made a great AI chip for, say, a Chromebook. Too visionary, I guess.


feranick commented 5 months ago

@pkgoogle if useful I can pass along my email exchange with the sales team (my email: feranick@gmail.com)

Skillnoob commented 5 months ago

@feranick I have recently written a guide on how to get the USB accelerator working with Ultralytics YOLOv8 based on your updated libs, which can be found here. This may be helpful for people trying to get the TPU working.

jakew009 commented 5 months ago

Hi @jakew009, Thanks for the information -- if you have any data on how many people this is affecting (besides what I can gather from this thread) I can push harder on my end.

Thanks for replying.

I can't answer for other people, but just for us: we have about 1200 Google Corals in production, and a further 1000 or so in stock (we had to forward-buy two years in advance because they were so hard to get hold of). These are the units that are physically soldered onto the PCBs we sell to customers.

It's really scary for a small company to have a chip soldered onto PCBs that we have invested hundreds of thousands into, and to be faced with the possibility that Google might completely abandon the software required to make them work.

I accept that Google are no longer developing new Coral products, but it doesn't seem too much to ask that they at least maintain the software required to make these chips work.

I'm also happy to pass on the email exchanges we have had with the Asus support team (who seem to have no idea what is going on other than saying that Google have stopped replying to their emails as well). My email is jake@ruggednetworks.co.uk.

I have also had some contact with Efren Robles from the Coral team before (https://www.linkedin.com/in/efrenrobles/) - I'm not sure if he is still involved but it might be worth reaching out to him as he was involved in distribution to commercial enterprises.

feranick commented 5 months ago

@Namburger Thank you. The fork with this PR may have some redundancies, but I am happy to align it with Google guidelines. Just let me know.

BTW, as you also noticed, the current head is built against the upcoming TF 2.16.0 (currently in rc0). I will keep updating it until it reaches stable; in the meantime, you can find a libedgetpu "stable" release here:

https://github.com/feranick/libedgetpu/releases/tag/v16.0-TF2.15.0-1

Namburger commented 5 months ago

@feranick thank you for your contribution and keeping the project updated.

I haven't been involved with google-coral for a while now, however I just regained access to maintain this repo and I'd like to commit to keeping it healthy. I'm not sure if I'll be able to commit code changes, however I can help merge changes that are necessary to keep this project healthy and updated.

Namburger commented 5 months ago

@Namburger Thank you. The fork with this PR may have some redundancies, but I am happy to align it with Google guidelines. Just let me know.

BTW, as you also noticed, the current head is built against the upcoming TF 2.16.0 (currently in rc0). I will keep updating it until it reaches stable; in the meantime, you can find a libedgetpu "stable" release here:

https://github.com/feranick/libedgetpu/releases/tag/v16.0-TF2.15.0-1

@feranick Do you think it makes sense for this PR to be based on a stable release, with another PR to follow once we have a stable TF 2.16.0?

feranick commented 5 months ago

Thanks again, @namburger, this is great news.

I am happy to help. Ideally, some code changes are needed (in the PR) to make the library viable going forward. I am sure you can cherry-pick commits from the PR to get there.


feranick commented 5 months ago

Absolutely. I just reverted back to TF 2.15.0 in the last commit.


feranick commented 5 months ago

The PR should now be in the stable release as per commit:

https://github.com/feranick/libedgetpu/commit/d90ba0125d28b5288a14eb67b1ebaeeaeae5b37a


Namburger commented 5 months ago

@feranick - thanks :1st_place_medal:

Alan01252 commented 5 months ago

You are a superstar @feranick!

Thanks @Namburger too! As a Rugged Networks employee, this is so greatly appreciated.

feranick commented 5 months ago

This is awesome, thanks, @namburger!

On 2/29/24 3:56 PM, Nam Vu wrote:

Merged #60 into master.


feranick commented 5 months ago

@namburger, Google used to provide compiled binaries for this lib. While I made several for Linux/macOS, I would hope that these can again be prepared and delivered through:

https://packages.cloud.google.com/apt

as the official channel (which is how most people get theirs, along with Windows builds, etc.). Is there any plan for that?
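
For context, the apt source entry behind that channel, as documented in Coral's getting-started guide at the time (the suite name and package names may have changed since), looked like this:

```text
# /etc/apt/sources.list.d/coral-edgetpu.list
deb https://packages.cloud.google.com/apt coral-edgetpu-stable main
```

After adding Google's apt signing key and running apt-get update, the runtime was installed as libedgetpu1-std (or libedgetpu1-max for the higher clock frequency).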

feranick commented 5 months ago

@feranick I have recently written a guide on how to get the usb accelerator working with ultralytics yolov8 based on your updated libs which can be found here. This may be helpful for people trying to get the tpu working.

@Skillnoob In case it's helpful, both the libcoral and pycoral libraries are now working for me in each of the forked repos. I hope they will be merged after TF 2.16.0 reaches stable. Details here: https://github.com/google-coral/pycoral/issues/137#issuecomment-1974048543

feranick commented 5 months ago

(PSA: updated libcoral and pycoral libraries are coming.) Details here:

https://github.com/google-coral/pycoral/issues/137#issuecomment-1974048543

Namburger commented 5 months ago

@feranick looks like https://github.com/tensorflow/tensorflow/pull/63075 is approved!

feranick commented 5 months ago

That is awesome, @namburger, thanks! Any chance that will be backported to 2.16.0?

Namburger commented 5 months ago

@feranick I'm not sure what TensorFlow's plans are for the next release, and I don't have much visibility/influence, unfortunately. I think in this instance we should try to get the PR merged first; then, if it isn't ported to a newer release, we can sync this repo to a commit that contains the change. That way we can get all of the Coral repos updated asap!

feranick commented 5 months ago

Thanks, @namburger. This sounds fine on my end.

For reference, a recent PR I submitted (https://github.com/tensorflow/tensorflow/issues/63018) was pulled in by @mihaimaruseac for TFLite and quickly backported to 2.16.0. Let's hope the same happens for https://github.com/tensorflow/tensorflow/pull/63075.

In any case, I will be here in case you need any action from my side.

It's really awesome that Coral is slowly getting back on track.


Namburger commented 5 months ago

@feranick - I've twisted some knobs in our internal codebase to get this commit out, would you like to test now?

feranick commented 5 months ago

@Namburger Will test as soon as I can today. Quick Q: is this available also on the 2.16.0 branch?

feranick commented 5 months ago

The reason I am asking is that I am building libcoral against 2.16.0 (as planned), not 2.17.0 as in the current master.

Namburger commented 5 months ago

@feranick I'm trying to find out; it would be nice to get this included as part of 2.16.

feranick commented 5 months ago

Thanks. If that doesn't happen, we will have to wait until 2.17, which is quite far down the road...


feranick commented 5 months ago

@Namburger So I tested it against the current TF master with both macOS and Linux, and all seems to work. To allow the use of TF master for temporary testing, I locally changed this in `libcoral/WORKSPACE`:

http_archive(
    name = "org_tensorflow",
    urls = [
        "https://github.com/tensorflow/tensorflow/archive/refs/heads/master.tar.gz",
    ],
    sha256 = "21a8363e3272a19977e2f0d12dcb87d1cb61ff0a79d20cfe456d9840e45e18d6",
    strip_prefix = "tensorflow-" + "master",
)

Namburger commented 5 months ago

@feranick Working with the tensorflow repo is a learning experience for me... I'll do a first pass to see how to get this backported to 2.16, and will let you know if we'll need a plan B.

feranick commented 5 months ago

I'd discuss it with @mihaimaruseac. My previous PR was quickly cherrypicked into 2.16.0 thanks to him.

Namburger commented 5 months ago

@feranick I've also talked to him; it doesn't seem like there is a plan to pick this into 2.16 at this time :/

For your suggested change,

http_archive(
    name = "org_tensorflow",
    urls = [
        "https://github.com/tensorflow/tensorflow/archive/refs/heads/master.tar.gz",
    ],
    sha256 = "21a8363e3272a19977e2f0d12dcb87d1cb61ff0a79d20cfe456d9840e45e18d6",
    strip_prefix = "tensorflow-" + "master",
)

could it work to just set TENSORFLOW_COMMIT=79ecb3f8bb6bd73f0115fa9a97b630a6f745a426?
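
To make that suggestion concrete: pinning a fixed commit instead of the moving master branch also keeps the archive's sha256 stable. A sketch of what the WORKSPACE entry could look like (this is an illustration, not the repo's actual configuration; the sha256 value is a placeholder that would need to be recomputed for the pinned archive):

```starlark
# Hypothetical sketch: pin org_tensorflow to a fixed commit.
TENSORFLOW_COMMIT = "79ecb3f8bb6bd73f0115fa9a97b630a6f745a426"
# Placeholder: recompute, e.g. with sha256sum over the downloaded .tar.gz.
TENSORFLOW_SHA256 = "0000000000000000000000000000000000000000000000000000000000000000"

http_archive(
    name = "org_tensorflow",
    urls = [
        "https://github.com/tensorflow/tensorflow/archive/" + TENSORFLOW_COMMIT + ".tar.gz",
    ],
    sha256 = TENSORFLOW_SHA256,
    strip_prefix = "tensorflow-" + TENSORFLOW_COMMIT,
)
```

GitHub's per-commit archive URL means the tarball contents (and hence the hash) never change under you, unlike refs/heads/master.tar.gz.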

feranick commented 5 months ago

@Namburger That is unfortunate. With regards to the specific commit, that would build against 2.17.0 at this point, correct? Given that 2.17.0 is unstable and under development, I wouldn't think it's a good idea...

feranick commented 5 months ago

@Namburger, compilation of libcoral works fine on macOS and Linux when building against TF commit 79ecb3f. I updated the PR to reflect that, and now all is working.

The concern remains, though, that it's building against an unstable version of 2.17.0. Also, should libedgetpu be built against the same TF? (The stable release is currently built against 2.15.0, but I am prepping the next one for TF 2.16 as soon as it reaches stable.)

Namburger commented 5 months ago

@feranick I hear the concerns. It doesn't look like we can get that into 2.16, so unless we want to sync to a 2.17 dev version, we're stuck in limbo until it comes out... which probably won't happen any time soon. It's very unfortunate that what ultimately blocks us is some visibility rules in Bazel.

At this point, my suggestion for moving forward and keeping all the main Coral repos updated is to sync to 79ecb3f8bb6bd73f0115fa9a97b630a6f745a426; we can monitor for breakage and apply updates as 2.17 is being developed. WDYT?

feranick commented 5 months ago

This sounds good to me @Namburger. The repos are already locally in sync with that commit.

I already found a possible breakage due to 2.17 (see log here) when cross-compiling for arm (there are no issues on macOS and Linux for x86). I forked 2.16.0 and added the visibility patch to test whether the issue is in 2.17 or not. Will report back.

feranick commented 5 months ago

@namburger since this PR has been merged and concerns libedgetpu (not libcoral), I am going to continue this reporting to the relevant PR:

https://github.com/google-coral/libcoral/pull/36

Namburger commented 5 months ago

This sounds good to me @Namburger. The repos are already locally in sync with that commit.

I already found a possible breakage due to 2.17 (see log here) when cross-compiling for arm (there are no issues on macOS and Linux for x86). I forked 2.16.0 and added the visibility patch to test whether the issue is in 2.17 or not. Will report back.

Interesting, it doesn't look like neon_fully_connected_arm32.cc has been changed for 7 months; that seems to indicate the compiler doesn't like the asm...

mihaimaruseac commented 5 months ago

So, unfortunately, this landed too late in the 2.16 release process (it was supposed to also have an RC1 to handle bugs from RC0, but the release team decided otherwise).

If you pin to a commit from the master branch, you will always build at that commit, but you will need to re-test whenever you move to a different one. Regarding the commit to pick, I would suggest either the commit where the support got added or one of the nightly commits afterwards (check that there is a pip wheel built for it).

feranick commented 5 months ago

Well, as unfortunate as that is, I am really grateful to you both, @mihaimaruseac and @Namburger, for trying to push this forward. I am still running a few build tests; it's pretty easy to set whichever version we want to build against.

If I were to build against the latest nightly, how do I find its corresponding commit?

mihaimaruseac commented 5 months ago

If you build on the day immediately after the nightly release, you can use the commit at the top of the nightly branch

If you build some days later, tf.version should provide the information (I don't recall the exact API; I left the actual TF team around 2 years ago and now I'm just consulting from time to time). Alternatively, you can look at the matching GH Actions run and take the commit the action ran at. That would be 019e960 in the attached screenshot (which corresponds to https://github.com/tensorflow/tensorflow/actions/runs/8150842615).
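
To make the "build some days later" case concrete: a tf-nightly version string encodes the build date, which identifies the nightly and hence the commit at the top of the nightly branch that day. A small sketch (the version-string format is an assumption based on PyPI's tf-nightly naming, e.g. `2.17.0.dev20240305`):

```python
import re

def nightly_date(version: str) -> str:
    """Extract the YYYYMMDD build date from a nightly version string,
    e.g. '2.17.0.dev20240305' -> '20240305'."""
    m = re.search(r"dev(\d{8})", version)
    if not m:
        raise ValueError(f"not a nightly version string: {version}")
    return m.group(1)

# The date tells you which nightly you are on; the matching commit is the
# one at the top of the nightly branch (or the GH Actions run) for that day.
print(nightly_date("2.17.0.dev20240305"))  # -> 20240305
```

From there, the commit can be read off the corresponding GH Actions run, as described above.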

feranick commented 5 months ago

Interesting, it doesn't look like neon_fully_connected_arm32.cc has been changed for 7 months; that seems to indicate the compiler doesn't like the asm...

The issue is specific to armv7a only. This is the commit whose edits are exactly what the compiler is complaining about. There are no specific details on what drove the commit in the first place...

https://github.com/tensorflow/tensorflow/commit/8419c70e2e14b79ad2c514835bf19830050c528e

Namburger commented 5 months ago

Interesting, it doesn't look like neon_fully_connected_arm32.cc has been changed for 7 months; that seems to indicate the compiler doesn't like the asm...

The issue is specific to armv7a only. This is the commit whose edits are exactly what the compiler is complaining about. There are no specific details on what drove the commit in the first place...

tensorflow/tensorflow@8419c70

Hmm, that certainly sounds like the issue. As I understand it, that commit changes the constraints on the registers used for those ops, which matches the error:

external/org_tensorflow/tensorflow/lite/kernels/internal/optimized/4bit/neon_fully_connected_arm32.cc: In function 'void tflite::optimized_4bit::NeonRunKernelNoSDot(const uint8_t*, const int8_t*, int32_t*, int, int, int, int, int, int) [with int RowsLeft = 4; int RowsRight = 1; int Cols = 32]':
external/org_tensorflow/tensorflow/lite/kernels/internal/optimized/4bit/neon_fully_connected_arm32.cc:192:7: error: 'asm' operand has impossible constraints
  192 |       asm volatile(KERNEL_4x1
      |       
feranick commented 5 months ago

Indeed. Weirdly enough, the change in that commit was only applied to neon_fully_connected_arm32.cc, not to neon_fully_connected_aarch64_sdot.cc, which in fact works fine.