Open CholoTook opened 3 years ago
Hi, let me look at it. There is probably something strange in the GPL file. Maybe editing the file would do the trick. Assuming this is only a one-time thing, that could be a good solution. Anyway, taking a look at the GPL file would shed some light on what the real reason is.
Could it be that the chromosome column starts out as an int, and then becomes a str?
!platform_table_begin
ID CHROMOSOME Position SNP Plus/Minus Strand CanineHD_A.bpm.Address SPOT_ID SNP_ID
BICF2G630100019 25 34549096 [A/G] BOT 25732300 BICF2G630100019
BICF2G630100032 25 34560607 [A/G] BOT 18759386 BICF2G630100032
BICF2G630100034 25 34561954 [A/G] BOT 13789354 BICF2G630100034
BICF2G630100043 25 34587072 [A/G] BOT 32780356 BICF2G630100043
BICF2G630100054 25 34604596 [T/C] BOT 21757302 BICF2G630100054
BICF2G630100063 25 34615165 [A/G] BOT 51809461 BICF2G630100063
BICF2G630100075 25 34638645 [A/C] BOT 55806509 BICF2G630100075
BICF2G63010009 X 95382735 [T/C] BOT 41613463 BICF2G63010009
BICF2G630100090 25 34688200 [T/C] BOT 51730475 BICF2G630100090
BICF2G630100094 25 34689509 [A/T] BOT 53724487 BICF2G630100094
BICF2G63010010 X 95373856 [A/G] BOT 49675468 BICF2G63010010
Pandas may guess that it's an int and then get confused... As I said, I'm not super familiar with pandas, but I suppose there is a way to let it know the datatype of each column. However, I don't know how GEOparse invokes Pandas.
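For reference, plain pandas does let you declare a column's type up front through the `dtype` argument of `read_csv`. Below is a minimal sketch on a few rows copied from the table above; the tab separator and the reduced set of columns are assumptions made for illustration, and this is not how GEOparse itself calls pandas.

```python
import io

import pandas as pd

# A few rows from the platform table above, reduced to three columns.
# Tab separation is an assumption about the original file layout.
sample = (
    "ID\tCHROMOSOME\tPosition\n"
    "BICF2G630100019\t25\t34549096\n"
    "BICF2G63010009\tX\t95382735\n"
    "BICF2G630100090\t25\t34688200\n"
)

# dtype= pins CHROMOSOME to str, so pandas never tries to infer int for it.
table = pd.read_csv(io.StringIO(sample), sep="\t", dtype={"CHROMOSOME": str})
chromosomes = table["CHROMOSOME"].tolist()
print(chromosomes)
```

With the column declared as `str`, both `25` and `X` come back as strings, so there is nothing left for pandas to guess.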
Indeed, this seems to be the problem. Currently, the package does not allow passing kwargs to Pandas. However, if the code is part of a script and the mixed types affect its behaviour, you could convert the column type after the data is read.
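A minimal sketch of that post-read conversion, assuming the parsed platform table ends up as a pandas DataFrame. The name `table` and the hand-built mixed column are hypothetical stand-ins for whatever the script actually gets back.

```python
import pandas as pd

# Hypothetical stand-in for a table whose CHROMOSOME column was read
# with mixed types: some values as int, others as str.
table = pd.DataFrame(
    {
        "ID": ["BICF2G630100019", "BICF2G630100032", "BICF2G63010009"],
        "CHROMOSOME": [25, 25, "X"],
    }
)

# Convert the whole column to strings so every value has the same type.
table["CHROMOSOME"] = table["CHROMOSOME"].astype(str)
converted = table["CHROMOSOME"].tolist()
print(converted)
```

This leaves the file and the parser untouched and only normalises the one column that triggers the warning.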
It doesn't seem to cause any problems, TBH. It's just a bit of a weird-looking error.
You could probably get away with the low_memory=False flag by default?
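As a rough illustration of what that flag does in plain pandas: by default `read_csv` processes the file in chunks internally, which can lead to different types being inferred for different parts of a column, while `low_memory=False` infers the type from the whole column at the cost of more memory. The sample below is made up and far too small to trigger the warning either way; it only shows the call.

```python
import io

import pandas as pd

# Made-up sample: mostly numeric chromosome values with one "X".
rows = "".join(f"snp{i}\t25\n" for i in range(5)) + "snpX\tX\n"
sample = "ID\tCHROMOSOME\n" + rows

# low_memory=False makes pandas infer each column's type from the
# whole file instead of chunk by chunk.
table = pd.read_csv(io.StringIO(sample), sep="\t", low_memory=False)
kinds = {type(value).__name__ for value in table["CHROMOSOME"]}
print(kinds)
```

Because one value is not numeric, the whole column is kept as strings rather than being split between int and str.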
Thanks for help, Dan.
The following code is generating a warning for me:
The output is:
I get that this error is coming from pandas, but I'm not sure how to fix it.