Desktop (please complete the following information):
OS: Ubuntu 24 LTS
Browser: Firefox
Smartphone (please complete the following information):
None
Describe the bug
The documentation and the S3 CSV data disagree on the ticker format for the FOREX and CRYPTO endpoints.
To Reproduce
No code is needed to find the issue.
Step 1: Run the web download query to get any minute-bar flat file for FOREX or CRYPTO.
Step 2: Unzip the file and read it.
Expected behavior
According to the documentation at https://polygon.io/flat-files/crypto-min-aggs/2023/12?crypto-min-aggs=documentation, the expected ticker structure is X:BTCUSD. In the S3 CSV data the ticker structure is X:BTC-USD, so the - is not expected.
Print of the TF tensor built from the CSV ticker column:
<tf.Tensor: shape=(11,), dtype=string, numpy= array([b'X:BTC-USD', b'X:BTC-USD', b'X:BTC-USD', b'X:BTC-USD', b'X:BTC-USD', b'X:BTC-USD', b'X:BTC-USD', b'X:BTC-USD', b'X:BTC-USD', b'X:BTC-USD', b'X:BTC-USD'], dtype=object)>
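The check in the reproduction steps can also be sketched without TensorFlow, using only the Python standard library. This is a minimal sketch, not the reporter's actual code: the column name `ticker`, the helper name `hyphenated_tickers`, and the sample rows are assumptions made for illustration, and the gzipped CSV is built in memory instead of being downloaded from S3.

```python
import csv
import gzip
import io


def hyphenated_tickers(gz_bytes, column="ticker"):
    """Return the sorted unique tickers in a gzipped CSV that contain a hyphen.

    `column` is an assumed header name; adjust it to the real file layout.
    """
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", newline="") as handle:
        reader = csv.DictReader(handle)
        return sorted({row[column] for row in reader if "-" in row[column]})


# Made-up rows standing in for a downloaded minute-bar flat file.
sample_csv = "ticker,volume\nX:BTC-USD,1.5\nX:BTC-USD,2.0\nX:ETHUSD,3.0\n"
sample_gz = gzip.compress(sample_csv.encode("utf-8"))

print(hyphenated_tickers(sample_gz))
```

Any ticker returned by this helper uses the hyphenated form that the documentation does not describe.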
Temporary fix (for me):
As part of my data validation pipeline, I retrieve the expected ticker list from the API endpoint /v3/reference/tickers and use it to validate that the S3 CSV files contain the expected tickers.
(for TensorFlow ops)
INPUT: tf.constant with dtype string holding the ticker column of the CSV
OUTPUT: tf.strings.regex_replace(INPUT, r"-", "")
so that the CSV data matches the expected ticker.
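For readers not using TensorFlow, the same workaround and validation step might look like the following plain-Python sketch. The helper names `normalize_ticker` and `unexpected_tickers` are hypothetical, and the `expected` set is a hard-coded stand-in for the list that would come from /v3/reference/tickers; no API call is made here.

```python
def normalize_ticker(ticker):
    """Strip hyphens so 'X:BTC-USD' becomes 'X:BTCUSD'."""
    return ticker.replace("-", "")


def unexpected_tickers(csv_tickers, expected):
    """Return normalized CSV tickers that are missing from the expected set."""
    normalized = {normalize_ticker(t) for t in csv_tickers}
    return sorted(normalized - set(expected))


# Hypothetical stand-in for the ticker list fetched from the reference endpoint.
expected = {"X:BTCUSD"}
csv_tickers = ["X:BTC-USD", "X:BTC-USD"]

# An empty list means every CSV ticker matched after normalization.
print(unexpected_tickers(csv_tickers, expected))
```

Removing every hyphen is a blunt rule: it assumes no expected ticker legitimately contains a hyphen, which should be checked against the reference list before relying on it.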
Screenshots
Not needed