Closed jason-brian-anderson closed 3 years ago
@jason-brian-anderson Just to confirm: you ran the same pipeline (with ImporterNode) twice and found that it did not reuse the cache? If so, could you set the logging level to DEBUG and see whether the extra logs give us more information?
Sorry, I'm only now getting back to this. It's not immediately obvious, at least to me, how to set the logging level to DEBUG in TF2. I'll look into it further so I can provide the debug info.
Please find the code below. 1) Set the logging level to 10, i.e., DEBUG, so that all log messages are printed:
import logging
import tensorflow as tf
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.DEBUG)
tf.compat.v1.logging.error('Error Message')
tf.compat.v1.logging.info('Info Message')
tf.compat.v1.logging.warning('Warning Message')
As shown below, all the log messages are printed:
ERROR:tensorflow:Error Message
INFO:tensorflow:Info Message
WARNING:tensorflow:Warning Message
2) Set the logging level to 30, i.e., WARN:
import logging
import tensorflow as tf
tf.compat.v1.logging.set_verbosity(30) # WARN
tf.compat.v1.logging.error('Error Message')
tf.compat.v1.logging.info('Info Message')
tf.compat.v1.logging.warning('Warning Message')
As shown below, the info message is filtered out, while the warning and error messages are printed:
ERROR:tensorflow:Error Message
WARNING:tensorflow:Warning Message
3) Set the logging level to 40, i.e., ERROR:
import logging
import tensorflow as tf
tf.compat.v1.logging.set_verbosity(40) # ERROR
tf.compat.v1.logging.error('Error Message')
tf.compat.v1.logging.info('Info Message')
tf.compat.v1.logging.warning('Warning Message')
Now only the error message is printed:
ERROR:tensorflow:Error Message
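The same threshold behaviour can be reproduced without TensorFlow. The sketch below is only an illustration, assuming that the numeric levels used above (10, 30, 40) match the constants in Python's standard `logging` module; the helper `emitted_messages` and the logger names are hypothetical and not part of TensorFlow or TFX.

```python
import io
import logging


def emitted_messages(threshold):
    """Return the messages that pass a logger whose level is `threshold`."""
    stream = io.StringIO()
    # Hypothetical logger name, one per threshold so the runs stay isolated.
    logger = logging.getLogger(f"verbosity_demo_{threshold}")
    logger.setLevel(threshold)
    logger.propagate = False  # keep output out of the root logger

    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
    logger.addHandler(handler)

    # Same three calls, in the same order, as in the examples above.
    logger.error("Error Message")
    logger.info("Info Message")
    logger.warning("Warning Message")

    logger.removeHandler(handler)
    return stream.getvalue().splitlines()


for threshold in (logging.DEBUG, logging.WARNING, logging.ERROR):
    print(threshold, emitted_messages(threshold))
```

A message is emitted only when its own level is at or above the configured threshold, which is why INFO (20) disappears at 30 and WARNING (30) disappears at 40.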
Hope this helps
Closing this, as it has been inactive and awaiting a response for some time. Please feel free to reopen.
It appears that the `enable_cache` interface to `tfx.components.Transform` is not effective when using `tfx.components.ImporterNode` to load a curated schema. The Transform component looks like:
The transformer is clearly processing the examples, since its run time ranges from 15 minutes to 4 hours depending on the data I feed it. Additionally, I searched the output for 'skip', as was recommended in a previous related issue, and found nothing. The output of the transformer in Airflow is: