Closed choosehappy closed 4 years ago
will try to look at this tonight. I don't think I've fully tested some of the unsupervised methods with data loaded through the Load_Data function. Will hopefully have a fix soon.
great, thanks! happy to help debug if useful
thanks, I definitely appreciate the help, especially since the repository is still under active development and likely has bugs that need ironing out.
not a problem, i certainly know how it is : )
I just pushed up a fix I think. Let me know if it works on your data!
Unfortunately it didn't appear to fix it, same error
is there perhaps some other debugging information I can provide?
I would check to make sure that your sample labels and class labels are string types and not integers or floats.
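A quick way to run that check, as a rough sketch: the arrays below are toy stand-ins, and `all_strings` is a hypothetical helper, not part of DeepTCR.

```python
import numpy as np

def all_strings(labels):
    """Return True if every element of `labels` is a str (np.str_ counts too)."""
    return all(isinstance(x, str) for x in labels)

# Toy stand-ins for the real label arrays.
sample_labels = np.array(["s1", "s2", "s3"])
class_labels = np.array([0, 1, 1])  # integer labels, the case to watch for

sample_ok = all_strings(sample_labels)
class_ok = all_strings(class_labels)
print(sample_ok, class_ok)

# Casting to str is one way to fix labels that came in as ints or floats.
class_labels_fixed = class_labels.astype(str)
```

If the labels come from a pandas column, `df[col].astype(str)` before converting should have the same effect.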
hmmm...they look good to me, appear to be all strings:
can you send me a sampling of this csv file you're trying to run to see if I can replicate the error? jsidhom1@jhmi.edu
1 step ahead of you : ) already getting in contact with the data-owner to ensure no confidentiality issues. i suspect we'll be ok. will send afterward, likely in the next few hours
all good. just sent it
figured it out. the problem is that you're passing lists to Load_Data, whereas the docs say you need to pass numpy arrays. if you wrap your inputs in np.array(), the code should work.
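For anyone loading from a single csv via pandas, here is a minimal sketch of the conversion. The DataFrame, column names, and sequences are made up for illustration, and the variable names are only placeholders for whatever you pass to Load_Data; check the docs for the actual argument names.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the csv file; column names and values are hypothetical.
df = pd.DataFrame({
    "cdr3_beta": ["CASSLGQAYEQYF", "CASSPGTGGYEQYF", "CASSQDRGNTEAFF"],
    "sample": ["s1", "s1", "s2"],
    "label": ["a", "a", "b"],
})

# A Python list and a numpy array print almost identically,
# but they are different types.
as_list = df["cdr3_beta"].tolist()
as_array = df["cdr3_beta"].to_numpy()
print(type(as_list).__name__, type(as_array).__name__)

# Either wrap an existing list in np.array(...) or use .to_numpy() directly.
beta_sequences = np.array(as_list)
sample_labels = np.array(df["sample"].tolist())
class_labels = df["label"].to_numpy()
```

Checking `type(x)` rather than eyeballing the printed values is the reliable way to tell the two apart.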
yes! that totally did it, thanks
i had tried to mirror the "1-loading data" tutorial, but realize now, looking at the data quickly, that numpy arrays and python lists appear similar at a quick glance. my mistake, sorry about that
you may want to add in a note in that particular file, and some type checking in the loading function to help others
in particular, it's quite unexpected that the first 2 functions (loading + training), which i perceived as the "important" ones, work okay but the 3rd one (clustering) doesn't. i think that's what threw me off. more commonly, if the data isn't in the right format, the first or second command fails immediately with a more obvious error message
anyway, very minor comments, thanks for all your help!
for sure! I added some data type checking into the Load_Data function to make sure inputs are numpy arrays, thanks!
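For reference, a fail-fast check of this kind could look roughly like the sketch below. This is not the actual code added to DeepTCR; `require_ndarray` and the argument names are hypothetical.

```python
import numpy as np

def require_ndarray(**named_inputs):
    """Raise TypeError for any provided input that is not a numpy array.

    Inputs left as None are skipped, so optional arguments stay optional.
    """
    for name, value in named_inputs.items():
        if value is not None and not isinstance(value, np.ndarray):
            raise TypeError(
                f"{name} must be a numpy.ndarray, got {type(value).__name__}; "
                "wrap it with np.array(...)"
            )

# Arrays and omitted (None) inputs pass silently.
require_ndarray(sequences=np.array(["CASSLGQAYEQYF"]), counts=None)

# A plain list is rejected at load time instead of failing later in clustering.
try:
    require_ndarray(sequences=["CASSLGQAYEQYF"])
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

Raising at load time puts the error next to its cause, which addresses the point above about loading and training succeeding before clustering fails.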
beautiful, and thanks again for the help!
Getting an error when using our data, after loading with:
Training appears to have gone ok :
But clustering appears to fail:
same error when using phenograph method, so isn't clustering approach specific. Also happens when randomly sampling:
Is it possible that there are some outliers produced by the clustering methods, causing "sel" to be not an integer? or perhaps there is some meta data i need to set?
Other functions appear to work okay:
i see #1 which has a similar error, but my data exists as a single csv file which i'm loading via pandas and chopping the necessary columns out of. as such, loading via directory doesn't appear to be an option
any ideas?