eprbell / dali-rp2

DaLI (Data Loader Interface) is a data loader and input generator for RP2 (https://pypi.org/project/rp2), the privacy-focused, free, open-source cryptocurrency tax calculator: DaLI removes the need to manually prepare RP2 input files. Just like RP2, DaLI is also free, open-source and it prioritizes user privacy.
https://pypi.org/project/dali-rp2/
Apache License 2.0
62 stars 41 forks

Kraken No Longer Offers Individual CSV Files #217

Closed macanudo527 closed 1 week ago

macanudo527 commented 6 months ago

For some inexplicable reason, Kraken has decided to no longer spend the extra few seconds it takes to unzip the massive CSV file that includes all the historical data into a separate Google folder.

This means that in order to access the CSV data, you have to download the single file that includes all OHLCVT data from Kraken, which is currently around 3.8 GB.

I'm not sure what to do here. One option is to download the whole file, unzip only the files needed, chunk them, and then delete the main file. I guess that is the only approach if we want to keep using the data. It temporarily requires 4 GB or so of disk space, and a lot of bandwidth, but maybe we can warn users about it?
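A minimal sketch of the "unzip only the files needed, then delete the main file" step, using Python's standard `zipfile` module. The function name `extract_needed`, its parameters, and the example file name `XBTUSD_1.csv` are hypothetical and only illustrate the idea; this is not the plugin's actual code, and chunking is left out.

```python
import zipfile
from pathlib import Path
from typing import Iterable, List


def extract_needed(
    archive: Path,
    wanted: Iterable[str],
    dest: Path,
    delete_archive: bool = False,
) -> List[Path]:
    """Extract only the members whose base name is in `wanted`.

    All names here are hypothetical; the layout of the real bulk file
    is an assumption.
    """
    wanted_set = set(wanted)
    extracted: List[Path] = []
    with zipfile.ZipFile(archive) as bundle:
        for info in bundle.infolist():
            # Match on the base name so members nested in a folder still match.
            if not info.is_dir() and Path(info.filename).name in wanted_set:
                extracted.append(Path(bundle.extract(info, path=dest)))
    if delete_archive:
        # Free the disk space once the needed files are safely extracted.
        archive.unlink()
    return extracted
```

Only the requested members are written to disk, so the extra space needed beyond the archive itself stays proportional to the pairs actually used, e.g. `extract_needed(Path("bulk.zip"), ["XBTUSD_1.csv"], Path("cache"), delete_archive=True)`.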

eprbell commented 6 months ago

Yeah, unfortunately we have dependencies on some fragile things (like this one and others like REST APIs: CB just deprecated a couple of endpoints we were using and broke the CB plugin). In this case I think it's OK to download the whole file and process its contents, as long as we let users know about what is going on with a message.

macanudo527 commented 6 months ago

Do you think we should prompt to download and then prompt to delete after unpacking what is needed?

They do have quarterly updates that could be downloaded later once the main bulk file has been downloaded.

eprbell commented 6 months ago

Sure, that sounds good: let's make it clear to the users that they're about to download gigs of data and let's also prompt them to delete it after we're done. The presence of quarterly updates reduces the pain a bit.
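The two prompts discussed above (before the download and before deleting the bulk file) could share one small helper. This is only a sketch: the name `confirm` and the injectable `ask` parameter are assumptions made for illustration, not existing DaLI functions.

```python
from typing import Callable


def confirm(question: str, ask: Callable[[str], str] = input) -> bool:
    """Return True only if the user explicitly answers yes.

    `ask` defaults to the built-in input(); it is a parameter so the
    helper can be exercised without a terminal (hypothetical design).
    """
    answer = ask(f"{question} [y/N] ").strip().lower()
    # Anything other than an explicit yes (including just Enter) means no.
    return answer in ("y", "yes")
```

Defaulting to "no" means that an accidental Enter neither starts a multi-gigabyte download nor deletes the downloaded file.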

macanudo527 commented 6 months ago

I'm going to write a bash script that backs up the .dali_cache to .dali_cache_bak, and another script that restores from .dali_cache_bak to .dali_cache. And probably a script that runs pytest between backing up and restoring just for good measure. I don't want to accidentally delete the Kraken cache and have to re-download the entire file.

Should I commit these scripts under a new directory, /scripts?
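For illustration, the backup and restore steps could also be sketched in portable Python instead of bash. This assumes `.dali_cache` is a directory; the function names `copy_cache`, `backup`, and `restore` are hypothetical and not part of the repository.

```python
import shutil
from pathlib import Path


def copy_cache(source: Path, target: Path) -> None:
    """Replace `target` with a full copy of `source` (assumed to be a directory)."""
    if not source.is_dir():
        raise FileNotFoundError(f"{source} does not exist")
    if target.exists():
        # Remove the stale copy so the result mirrors `source` exactly.
        shutil.rmtree(target)
    shutil.copytree(source, target)


def backup(cache: Path = Path(".dali_cache")) -> None:
    """Copy .dali_cache to .dali_cache_bak."""
    copy_cache(cache, cache.with_name(cache.name + "_bak"))


def restore(cache: Path = Path(".dali_cache")) -> None:
    """Copy .dali_cache_bak back to .dali_cache."""
    copy_cache(cache.with_name(cache.name + "_bak"), cache)
```

Running `backup()`, then the test suite, then `restore()` would give the "pytest between backing up and restoring" flow mentioned above.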

eprbell commented 6 months ago

Sounds good, just a couple of questions/comments:

macanudo527 commented 6 months ago

Whoops. I've been doing too much ROR development. It's UNIX only and usually it is more efficient to create a script.

Yeah, doing it in Python and putting it in bin is good. I'll whip it up and try to get a barebones Kraken CSV plugin out here soon.

macanudo527 commented 1 week ago

This was fixed a while ago and #225 fleshed it out well enough, so closing for now.