Closed albertplus007 closed 4 years ago
Dear Albert, thanks for your interest. Please use read.neurons
to read all of your neurons in one go into a neuronlist
object. Then use dotprops
to convert them into the preferred format for NBLAST. See e.g. https://groups.google.com/forum/#!topic/nat-user/_4d1XCtbM90 for an example. Best, Greg.
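The two steps above can be sketched in R roughly as follows (a minimal sketch; the directory path and variable names are placeholders, not from the original thread):

```r
library(nat)

# Read all SWC files in a directory into a single neuronlist object
# (the path is a placeholder)
nl <- read.neurons("path/to/swc/folder", format = "swc")

# Convert every neuron to a dotprops object, the format NBLAST expects
dps <- dotprops(nl, k = 5)
```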
Thanks, I have another question. The function we use is dotprops(skel, resample = 1, k = 5), i.e. with resample = 1
. But the resolution of my data is 4x4x40 nm
; should I still use resample = 1
? For now I use resample = 100
to create the preferred format, because resample = 1
uses a lot of memory.
Question 2:
I randomly selected a neuron from my own data as an NBLAST query, then ran scores = nblast(skel_dps, skel_all_dps)
, but the scores are very strange:
apart from the query neuron itself (the first entry), all the scores are negative. What is happening here?
I then changed to resample = 10
and the same thing happened again: apart from the query neuron, all scores are negative.
The solution to both of these issues is that you need to convert your neurons from units of nm to microns before calculating the "dotprops" objects. You can just divide the neuronlist generated by read.neurons by 1000.
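As a sketch of that fix (assuming, as discussed above, that the raw coordinates are in nanometres; variable names are placeholders):

```r
library(nat)

# 'skels' is assumed to be a neuronlist read with read.neurons,
# with coordinates in nm
skels_um <- skels / 1000   # nm -> microns

# Now resample = 1 means "resample every 1 micron"
skel_dps <- dotprops(skels_um, resample = 1, k = 5)
```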
Thanks a lot, I have another question. All the FlyCircuit neuron data are in dps
, which is a large neuronlistfh. I converted the neuronlistfh into a neuronlist named dps_neuronlist
, but when I run the following code to cluster all the neurons, it takes a very long time and has not finished yet:
scores = nblast_allbyall(dps_neuronlist)
The neuronlistfh was downloaded using the NBLAST paper's code.
I don't know why it takes so long; is it not suitable for large data? (There are 16129 neurons in the neuronlist.)
See the bottom of this page (linked from the NBLAST website):
https://gist.github.com/jefferis/bbaf5d53353b3944c090
It is recommended to use remotesync
to download all the FlyCircuit neurons.
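A hedged sketch of that step (assuming dps is the neuronlistfh object loaded via the gist's code):

```r
library(nat)

# Download any remote objects not yet cached locally, so that later
# access to the neurons is fast and fully offline
dps <- remotesync(dps, download.missing = TRUE)
```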
Thanks a lot.
I used it, and all the FlyCircuit neurons are now on my local machine, and I changed the neuronlistfh into a neuronlist.
But when I run scores = nblast_allbyall(dps_neuronlist)
, it still takes a very long time and has not finished yet. I don't know what is causing this.
I just want to cluster all the FlyCircuit neurons; the function seems to struggle with such a large dataset.
If you want to do this:
scores = nblast_allbyall(dps_neuronlist)
Although NBLAST is fast by the standards of neuron comparison tools, typically ~2 ms per comparison for the FlyCircuit dataset, this will indeed take a long time. There are 16129 neurons in dps_neuronlist
, which means there are 16129^2 = 260,144,641 comparisons to make. At 2 ms per comparison, that is 16129^2*2/1000/3600/24 ≈ 6 days.
In fact you can make an estimate on your own machine like this by choosing a sample of 1% of the neurons:
set.seed(42)
dpsample=sample(dps, round(length(dps)/100))
system.time(nblast_allbyall(dpsample, .progress='text'))
For me on my laptop this took about 70 s. The whole dataset will take 100^2 times longer, i.e. 69.952*100^2/3600/24 ≈ 8.09 days.
It is possible to speed this up by running comparisons in parallel. You can do this as follows:
# 4 parallel jobs
doMC::registerDoMC(cores = 4)
system.time(nblast_allbyall(dpsample, .parallel=TRUE))
Note that it is worth testing different numbers of cores, as the speedup will not be linear across the full range of cores. It will also depend on how long the whole job takes, since there is a fixed cost associated with starting each parallel job. Be careful, too: using more cores uses more memory, and may eventually exhaust the memory on your machine and make everything much slower. For me, 4 cores did give a 4x speedup, and therefore about 2 days for the whole dataset. Note that it was a bit quicker in the command-line version of R than inside RStudio.
Thanks a lot, I got it.
I still have a question. Once we have the scores
for the FlyCircuit dataset and want to cluster them, the function nhclust
can only process 4000
neurons. Should I just select 4000 neurons at a time, e.g. hcc = nhclust(files[1:4000], scoremat = scores)
? Would that give the same result as clustering all of them, e.g. hcc = nhclust(scoremat = scores)
(if there were no limit from maxneurons
)?
You can increase the maximum number (it was left as a sort of safety threshold). See the maxneurons
argument of http://natverse.org/nat.nblast/reference/nhclust.html.
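For example (a sketch; it assumes scores is the all-by-all score matrix computed earlier, and the exact maxneurons value is just an illustration large enough to cover the 16129 neurons):

```r
library(nat.nblast)

# Raise the safety threshold so the whole dataset can be clustered at once
hcc <- nhclust(scoremat = scores, maxneurons = 17000)

# Inspect the resulting hierarchical clustering as a dendrogram
plot(hcc)
```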
Thanks a lot. Sorry to disturb you; I asked a silly question and solved it myself after submitting the issue.
Hi, nice library!
Recently I used NBLAST to search for similar neurons among data I produced myself. The neurons were saved as
.swc
files. I read them with the following code and transformed them to a neuronlist,
and then ran nblast,
but it gives me the following:
So I am wondering: is the neuronlist I created incorrect? What should I do?