allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0
5.65k stars 652 forks

Clearml does not work with tensorflow 2.5,2.6 #429

Open nirschipper opened 3 years ago

nirschipper commented 3 years ago

I upgraded my tensorflow to 2.6 (and then downgraded to 2.5). Importing tensorflow and then trying to initialize a task yields the following error: AttributeError: module 'tensorflow.python.training.tracking.tracking' has no attribute 'cached_per_instance'. I downgraded to 2.1 and things are working fine now, so my problem is solved, but it's still a bug.
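A minimal sketch of how one might check whether the attribute named in the error exists in a given environment, without involving clearml at all. The module path and attribute name are taken from the error message above; the helper name `module_has_attr` is hypothetical, and the check only reports what the installed TensorFlow exposes, it does not explain why the attribute is missing.

```python
import importlib

# Module path and attribute name copied from the AttributeError in the report.
MODULE = "tensorflow.python.training.tracking.tracking"
ATTR = "cached_per_instance"


def module_has_attr(module_path, attr):
    """Return True/False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # Module (or TensorFlow itself) is not importable in this environment.
        return None
    return hasattr(module, attr)


if __name__ == "__main__":
    print(MODULE, ATTR, module_has_attr(MODULE, ATTR))
```

If this prints `False` or `None` in the failing environment, the problem likely lies in the installed TensorFlow packages (for example a mixed or partial install) rather than in the task initialization code itself.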

JDennisJ commented 3 years ago

Hi @nirschipper ,

I tried to reproduce the issue but had no luck: I used TF 2.5 and 2.6 and everything works for me. Can you try running one of the examples here and update us if you get the failure?

Do you have a toy example (with requirements) I can run to reproduce the issue? Which clearml version are you using? Can you attach the full error trace?
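One possible way to gather the version information requested above is to query the installed package metadata. This is a sketch using only the Python standard library (3.8+); the helper name `installed_versions` is hypothetical, and it assumes the packages were installed under the distribution names `clearml` and `tensorflow`.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(["clearml", "tensorflow"]).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

TensorFlow is sometimes installed under a different distribution name (for example `tensorflow-gpu`), in which case `tensorflow` would show as not installed here even though `import tensorflow` works.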

bmartinn commented 3 years ago

Hi @nirschipper, could this be related? https://github.com/tensorflow/models/issues/9376

nirschipper commented 3 years ago

I get the same error (missing attribute). I ended up downgrading my TF version, so I didn't go too deep into the issue.



nirschipper commented 3 years ago

I applied the solution you are suggesting (the registry key) a long time ago, and it is still set. This error (file too long) happened after I updated my master branch, which added new files (not added by me personally). Since I have the fix in place, long file names don't cause errors for me, so it only came up when I tried to run this task. Also, basic tasks (like training) still run fine. Hyperparameter optimization is a compound task, so something there works differently (one of the things it does that training tasks don't do is compare against the remote).

On Fri, Aug 20, 2021 at 1:04 AM Martin.B wrote:

I personally switched that off in my computer (I checked it's still off), so the problem must arise from settings exported from somewhere else.

@nirschipper what do you mean by that? The file-too-long error does match the original log, but how could that be? I'm assuming you are running both local and remote (agent) on Windows machines, so how could one create a path that is too long for the other? Could it be that adding "C:/Users/nir.s/.clearml/venvs-builds/3.8/task_repository/" to the path actually causes the full path to exceed the Windows MAX_PATH? See here:

https://docs.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=cmd

Is this the fix you were referring to?

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem]
"LongPathsEnabled"=dword:00000001
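The hypothesis in the quoted message, that prepending the agent's checkout directory pushes a repository path past MAX_PATH, can be checked with simple arithmetic. The sketch below assumes the documented limit of 260 characters including the terminating null; the prefix is the one quoted above, while the helper name `exceeds_max_path` and any relative paths passed to it are hypothetical.

```python
import ntpath

# Windows MAX_PATH: 260 characters, counting the terminating null character.
MAX_PATH = 260

# Checkout prefix quoted in the message above.
AGENT_PREFIX = "C:/Users/nir.s/.clearml/venvs-builds/3.8/task_repository/"


def exceeds_max_path(checkout_prefix, relative_path):
    """Return True if prefix + relative path would not fit within MAX_PATH."""
    full_path = ntpath.join(checkout_prefix, relative_path)
    # +1 accounts for the terminating null character.
    return len(full_path) + 1 > MAX_PATH


if __name__ == "__main__":
    # Hypothetical repository file; replace with a real path from the failing repo.
    print(exceeds_max_path(AGENT_PREFIX, "src/some_package/some_module.py"))
```

Running this over every tracked file in the repository would show whether any of the newly added files crosses the limit only once the agent prefix is added, which would be consistent with the error appearing on the agent side but not in the local checkout.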

