lpantano / seqcluster

small RNA analysis from NGS data
http://seqcluster.readthedocs.io
MIT License

TypeError: 'map' object is not subscriptable #47

Closed lurebgi closed 4 years ago

lurebgi commented 5 years ago

Hi, I came across this error: TypeError: 'map' object is not subscriptable

INFO 1834545 Clusters read
INFO Creating meta-clusters based on shared sequences: 61291
99% (61244 of 61291) |################# | Elapsed Time: 7:19:39 ETA: 0:00:00
INFO 512 metaclusters from 33960 sequences
INFO 512 clusters found
INFO counts after: 19759492
INFO # sequences after: 61375
INFO Solving multi-mapping events in the network of clusters
INFO Number of loci: 1834545
N/A% (0 of 512) | | Elapsed Time: 0:00:00 ETA: --:--:--
Traceback (most recent call last):
  File "/scratch/luohao/software/seqcluster/bin/seqcluster", line 10, in <module>
    sys.exit(main())
  File "/scratch/luohao/software/seqcluster/lib/python3.6/site-packages/seqcluster/command_line.py", line 28, in main
    cluster(kwargs["args"])
  File "/scratch/luohao/software/seqcluster/lib/python3.6/site-packages/seqcluster/make_clusters.py", line 76, in cluster
    clusLred = _cleaning(clusL, args.dir_out)
  File "/scratch/luohao/software/seqcluster/lib/python3.6/site-packages/seqcluster/make_clusters.py", line 294, in _cleaning
    clus_obj = reduceloci(clusL, path)
  File "/scratch/luohao/software/seqcluster/lib/python3.6/site-packages/seqcluster/detect/metacluster.py", line 77, in reduceloci
    _write_cluster(c, clus_obj.clus, clus_obj.loci, n_cluster, path)
  File "/scratch/luohao/software/seqcluster/lib/python3.6/site-packages/seqcluster/detect/metacluster.py", line 102, in _write_cluster
    print("\t".join(pos[:4] + [str(len(cluster[idc].loci2seq[idl]))] + [pos[-1]]), file=out_handle, end="")
TypeError: 'map' object is not subscriptable
100% (61291 of 61291) |##################| Elapsed Time: 7:20:41 Time: 7:20:41
100% (512 of 512) |######################| Elapsed Time: 0:00:14 Time: 0:00:14

Thanks, Luohao
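
For context, this message usually appears when code written for Python 2 runs under Python 3, where map() returns a lazy iterator instead of a list, so slicing it (as pos[:4] does in the traceback) fails. The snippet below is only a minimal sketch of that behaviour: the field values are made up, and how pos is actually built inside seqcluster is an assumption, not something shown in this thread.

```python
# Made-up fields standing in for one locus record (hypothetical values)
fields = "chr1 100 200 + 5".split()

# Python 3: map() returns an iterator, so slicing it raises TypeError
pos = map(str, fields)
try:
    pos[:4]
except TypeError as err:
    message = str(err)

# Materialising the iterator as a list makes slicing and indexing work again
pos = list(map(str, fields))
line = "\t".join(pos[:4] + [pos[-1]])
```

Wrapping the call in list() is the usual remedy when the result is sliced or indexed later; whether that is how the bug was eventually fixed in seqcluster is not stated here.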

lpantano commented 5 years ago

Hi,

Sorry about the issue. What version of seqcluster are you running?

If I can have the input files, I can try to debug. If it is not possible to share the files, I usually add a pdb.set_trace() inside a try/except statement to get control at the exact point where the issue is happening, and then look at which variable is causing it.

Let me know if you can share your files with me.

Cheers
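
The try/except debugging approach mentioned in this comment can be sketched roughly as follows. This is a minimal illustration using pdb.post_mortem from the standard library; run_with_debugger and write_position are hypothetical names invented for the example and are not part of seqcluster.

```python
import pdb
import sys


def run_with_debugger(func, *args, debugger=pdb.post_mortem, **kwargs):
    """Call func; if it raises, open a debugger at the failing frame, then re-raise."""
    try:
        return func(*args, **kwargs)
    except Exception:
        # sys.exc_info()[2] is the traceback of the exception being handled
        debugger(sys.exc_info()[2])
        raise


def write_position(pos):
    # Hypothetical stand-in for the failing line in the traceback
    return "\t".join(pos[:4])
```

Calling run_with_debugger(write_position, map(str, range(5))) would stop in an interactive pdb session inside write_position, where pos can be inspected (for example with type(pos)). The debugger parameter exists only so the helper can be exercised without an interactive prompt.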

lurebgi commented 5 years ago

Hi,

I installed it via conda; the version should be 1.2.5.

https://drive.google.com/drive/folders/1ck16RgzuG9J5gl8GyHRU_ZH17WwqMaKx?usp=sharing (please request access if it is not accessible)

lpantano commented 5 years ago

Thank you, I am working on this issue. From the numbers, though, I see that the sequences are spread widely over the genome and that there is a large number of loci compared to sequences. That normally indicates that this is not typical small RNA data.

Can I know what kit was used to produce this data, or the size distribution of the reads? Is there a peak at any size?

Because of this, debugging may take longer, since it takes a lot of time to go through all those loci. But I will try to find the issue.

Thank you
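
One way to answer the size-distribution question is to tabulate read lengths straight from the FASTQ file. The sketch below is a minimal example that assumes a standard four-line-per-record FASTQ with unwrapped sequences; read_length_distribution and the three example reads are hypothetical and not taken from the data in this thread.

```python
from collections import Counter


def read_length_distribution(fastq_lines):
    """Count read lengths from an iterable of FASTQ lines (4 lines per record)."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the sequence is the second line of each record
            counts[len(line.strip())] += 1
    return counts


# Tiny in-memory example: two 22-nt reads and one 30-nt read
example = [
    "@read1", "TGAGGTAGTAGGTTGTATAGTT", "+", "I" * 22,
    "@read2", "TGAGGTAGTAGGTTGTATAGTT", "+", "I" * 22,
    "@read3", "ACGT" * 7 + "AC", "+", "I" * 30,
]
dist = read_length_distribution(example)
```

For a real file, an open file handle can be passed in place of the list (gzip.open(path, "rt") for compressed input). A peak in the counts near the expected miRNA length would be the kind of signal being asked about.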

lurebgi commented 5 years ago

Hi,

I am re-analyzing the data from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4855716/

The data should be fine. Maybe rRNAs and tRNAs need to be filtered first?

L

lpantano commented 5 years ago

Thanks for the link.

While I am debugging this, do you think you could try removing the scaffolds before mapping and reducing the -m option to 100-200? I just want to check whether that makes the data less duplicative.

Thanks
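
Removing scaffolds before mapping could be done by filtering the reference FASTA, for instance. The following is only a sketch under the assumption that scaffold records can be recognised by a name prefix such as "scaffold"; the prefix, the drop_scaffolds helper, and the example records are all hypothetical, so the prefix should be adjusted to the naming used in the actual assembly.

```python
def drop_scaffolds(fasta_lines, prefixes=("scaffold",)):
    """Yield FASTA lines, skipping records whose name starts with one of the prefixes."""
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            # str.startswith accepts a tuple of candidate prefixes
            keep = not name.lower().startswith(prefixes)
        if keep:
            yield line


# Made-up reference with two chromosomes and one scaffold
genome = [">chr1", "ACGT", ">scaffold_12", "TTTT", "GGGG", ">chr2", "CCCC"]
kept = list(drop_scaffolds(genome))
```

The filtered lines can then be written to a new FASTA file and indexed for mapping in the usual way.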

lurebgi commented 5 years ago

Please find the file 'seqs.m100.sort.bam' in the same drive folder.

lpantano commented 5 years ago

Hi,

Did you try running that file to see if it breaks as well? It is working with that file for me; I just want to confirm that it is not a user-specific error.

Thanks

lpantano commented 4 years ago

This is now fixed; version 1.2.7 should work.