Closed YuntaoTan closed 5 years ago
Hello, can you send me the out.log file? The min_copy_number parameter is the minimum number of elements a family must contain to remain valid. For a small genome I'd suggest 2 or 3.
Hi @juancrescente, thanks for your reply. My out.log follows.
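To illustrate what the parameter does, here is a minimal sketch. It assumes families are held as a mapping from a family name to a list of member sequence IDs; the function name and data layout are hypothetical and are not MITE-Tracker's actual code.

```python
def filter_families(families, min_copy_number=3):
    """Keep only families with at least `min_copy_number` members.

    `families` maps a (hypothetical) family name to a list of member IDs.
    """
    return {
        name: members
        for name, members in families.items()
        if len(members) >= min_copy_number
    }


if __name__ == "__main__":
    example = {
        "fam_a": ["s1", "s2", "s3", "s4"],
        "fam_b": ["s5", "s6"],
        "fam_c": ["s7"],
    }
    # With a threshold of 2, the singleton family is dropped.
    print(sorted(filter_families(example, min_copy_number=2)))
```

Under this reading, lowering the threshold can only help if some families have at least that many members. If every cluster is a singleton, no threshold of 2 or more will keep anything.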
2019-05-06 17:25:27,801 Clustering
2019-05-06 17:25:27,801 /export/personal1/tanyt/Pipeline/Repeat/pipeline3/MITE/MITE-Tracker/vsearch-2.7.1/bin/vsearch --cluster_fast results/atest/candidates.fasta --threads 51 --strand both --clusters results/atest/temp/clust --iddef 1 --id 0.88
2019-05-06 18:46:02,392 Clustering done
2019-05-06 18:46:02,393 Filtering clusters
2019-05-06 18:46:15,366 Initial clusters: 240464
2019-05-06 21:44:43,660 Clusters: 0
2019-05-06 21:44:44,340 15609.018525 secs
2019-05-07 09:33:06,034 Clustering
2019-05-07 09:33:06,035 /export/personal1/tanyt/Pipeline/Repeat/pipeline3/MITE/MITE-Tracker/vsearch-2.7.1/bin/vsearch --cluster_fast results/atest/candidates.fasta --threads 51 --strand both --clusters results/atest/temp/clust --iddef 1 --id 0.88
2019-05-07 15:27:15,842 Clustering done
2019-05-07 15:27:15,843 Filtering clusters
2019-05-07 15:27:29,181 Initial clusters: 240464
2019-05-07 21:10:57,389 Clusters: 0
2019-05-07 21:10:59,967 41924.152439 secs
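The timestamps above already show where the time goes. A minimal sketch for turning such a log into per-stage durations follows; it assumes every line starts with a 23-character timestamp in the format seen above, and the helper names are hypothetical.

```python
from datetime import datetime

# Timestamp layout seen in out.log, e.g. "2019-05-06 17:25:27,801"
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TIMESTAMP_WIDTH = 23


def parse_log_line(line):
    """Split a log line into (datetime, message)."""
    stamp = line[:TIMESTAMP_WIDTH]
    message = line[TIMESTAMP_WIDTH:].strip()
    return datetime.strptime(stamp, LOG_TIME_FORMAT), message


def stage_durations(lines):
    """Return (message, seconds until the next log line) for consecutive lines."""
    events = [parse_log_line(line) for line in lines if line.strip()]
    return [
        (message, (next_time - time).total_seconds())
        for (time, message), (next_time, _) in zip(events, events[1:])
    ]


if __name__ == "__main__":
    sample = [
        "2019-05-06 17:25:27,801 Clustering",
        "2019-05-06 18:46:02,392 Clustering done",
    ]
    for message, seconds in stage_durations(sample):
        print(f"{message}: {seconds:.1f} s")
```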
There are 0 clusters (2019-05-07 21:10:57,389 Clusters: 0). I also tried setting --min_copy_number to 2, but still nothing was found. My species is a plant, and MITE-Hunter finds many MITEs in it.
Another possible issue: VSEARCH writes a very large number of small files. In my case there are 240464 initial clusters, so I got 240464 files, which makes I/O very busy and slow. Would you consider another clustering tool such as cd-hit?
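One way to check whether the clusters are mostly singletons is to count the sequences in each cluster file. The sketch below assumes the files written by --clusters are plain FASTA files whose names start with the given prefix (clust in the command above). The function names are hypothetical, and this is a diagnostic aid rather than part of MITE-Tracker.

```python
from collections import Counter
from pathlib import Path


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def cluster_size_histogram(directory, prefix="clust"):
    """Map cluster size -> number of cluster files with that size."""
    sizes = Counter()
    for path in Path(directory).glob(prefix + "*"):
        if path.is_file():
            sizes[count_fasta_records(path)] += 1
    return sizes


if __name__ == "__main__":
    # The directory is taken from the logged command; adjust as needed.
    histogram = cluster_size_histogram("results/atest/temp", prefix="clust")
    for size in sorted(histogram):
        print(f"{size} sequence(s): {histogram[size]} cluster(s)")
```

If almost every file holds a single sequence, a final count of 0 clusters would be expected for any --min_copy_number of 2 or more.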
We do not use cd-hit because of its execution time. It seems the clusters are not similar to each other. Maybe you can send me your sequences so I can take a look? Write to me at juan.crescente at gmail.com if you want.
Hi, I tried MITE-Tracker; it runs faster than MITE-Hunter. I split my genome, which contains 738 contigs (~300 Mbp), into 38 files. I can find candidate MITEs in each file, but I cannot get a final result, like the following:
there is nothing in
Can you help me? Is the --min_copy_number parameter the key? I set it to 4, like you. Does that value correspond to tetraploidy in your wheat test? My species is diploid, so should I change it to 2?