Coppelian closed this issue 2 years ago
If you're using a Mac, you need to downgrade Python to 3.6, because the tokenizer uses multiprocessing features that don't work on Macs with Python versions later than 3.6. Better yet: don't use a Mac.
@crista Thank you for your response! I'll set up an Ubuntu system with Python 3.6 to see if it works.
@Coppelian if you use Ubuntu, there is no problem -- you can use the latest Python.
@crista Hi, Crista. I set up Ubuntu 18.04 LTS with Python 3.5 and still got the same error. What should I do in this situation? And what environment do you use to run the block-level tokenizer?
Please use Python 3.8 or later on Ubuntu.
I tried it on Python 3.7 and 3.8 and got the same error. I'm now using Ubuntu 16.04 LTS with Python 3.8. You can find my log here: LOG-0.log. I wonder what actually causes this problem, since the file-level code works perfectly now.
Can you share the details of the environment you use to run the block-level tokenizer? I'm stuck here and need a way out. Thank you for your help.
@Coppelian the block-level tokenizer is not supported anymore.
Oh. Thank you for letting me know.
Hi @crista. Can SourcererCC report the cloned lines when using the function-level tokenizer, or does it have to use the block-level tokenizer?
Hi @crista. Sorry for @-mentioning you several times. I think I got some results using the block-level tokenizer: it generates results if you use Python 2.7. It looks like the tokenizer itself has not been updated to Python 3. Thank you for your help.
This is the result using py27:
GO
'zipblocks'format
*** Starting priority projects...
*** Starting regular projects...
Starting new process 0
*** No more projects to process. Waiting for children to finish...
[INFO] (MainThread) Process 0 starting
[INFO] (MainThread) Starting zip project <11,test-env/2Shirt-SpellBurner.zip> (process 0)
[INFO] (MainThread) Attempting to process_zip_ball test-env/2Shirt-SpellBurner.zip
[INFO] (MainThread) Successfully ran process_zip_ball test-env/2Shirt-SpellBurner.zip
[INFO] (MainThread) Project finished <11,test-env/2Shirt-SpellBurner.zip> (process 0)
[INFO] (MainThread) (0): Total: 0:00:00.012767micros | Zip: 0 Read: 0 Separators: 0micros Tokens: 0micros Write: 0micros Hash: 0 regex: 0
[INFO] (MainThread) Starting zip project <12,test-env/2xyo-indicator-ip.zip> (process 0)
[INFO] (MainThread) Attempting to process_zip_ball test-env/2xyo-indicator-ip.zip
[INFO] (MainThread) Attempting to process_file_contents test-env/2xyo-indicator-ip.zip/indicator-ip-master/test.py
[INFO] (MainThread) Starting tokenize_blocks of test-env/2xyo-indicator-ip.zip/indicator-ip-master/test.py
[WARNING] (MainThread) File test-env/2xyo-indicator-ip.zip/indicator-ip-master/test.py cannot be parsed. encoding declaration in Unicode string (<unknown>, line 0)
[INFO] (MainThread) Returning None on tokenize_blocks for file test-env/2xyo-indicator-ip.zip/indicator-ip-master/test.py.
[WARNING] (MainThread) Problems tokenizing file test-env/2xyo-indicator-ip.zip/indicator-ip-master/test.py
[INFO] (MainThread) Successfully ran process_zip_ball test-env/2xyo-indicator-ip.zip
[INFO] (MainThread) Project finished <12,test-env/2xyo-indicator-ip.zip> (process 0)
[INFO] (MainThread) (0): Total: 0:00:00.001362micros | Zip: 90 Read: 51 Separators: 0micros Tokens: 0micros Write: 0micros Hash: 0 regex: 0
[INFO] (MainThread) Starting zip project <13,test-env/3demax-Take-a-break.zip> (process 0)
[INFO] (MainThread) Attempting to process_zip_ball test-env/3demax-Take-a-break.zip
[INFO] (MainThread) Attempting to process_file_contents test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/appmenu.py
[INFO] (MainThread) Starting tokenize_blocks of test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/appmenu.py
[WARNING] (MainThread) Finished step1 on process_file_contents
[WARNING] (MainThread) Finished step2 on process_file_contents
[INFO] (MainThread) Successfully ran process_file_contents test-env/3demax-Take-a-break.zip/test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/appmenu.py
[INFO] (MainThread) Attempting to process_file_contents test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/dynamic_status_icon.py
[INFO] (MainThread) Starting tokenize_blocks of test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/dynamic_status_icon.py
[WARNING] (MainThread) Finished step1 on process_file_contents
[WARNING] (MainThread) Finished step2 on process_file_contents
[INFO] (MainThread) Successfully ran process_file_contents test-env/3demax-Take-a-break.zip/test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/dynamic_status_icon.py
[INFO] (MainThread) Attempting to process_file_contents test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/teatime.py
[INFO] (MainThread) Starting tokenize_blocks of test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/teatime.py
[WARNING] (MainThread) Finished step1 on process_file_contents
[WARNING] (MainThread) Finished step2 on process_file_contents
[INFO] (MainThread) Successfully ran process_file_contents test-env/3demax-Take-a-break.zip/test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/teatime.py
[INFO] (MainThread) Attempting to process_file_contents test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/timer.py
[INFO] (MainThread) Starting tokenize_blocks of test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/timer.py
[WARNING] (MainThread) Finished step1 on process_file_contents
[WARNING] (MainThread) Finished step2 on process_file_contents
[INFO] (MainThread) Successfully ran process_file_contents test-env/3demax-Take-a-break.zip/test-env/3demax-Take-a-break.zip/Take-a-break-master/examples/timer.py
[INFO] (MainThread) Attempting to process_file_contents test-env/3demax-Take-a-break.zip/Take-a-break-master/take-a-break.py
[INFO] (MainThread) Starting tokenize_blocks of test-env/3demax-Take-a-break.zip/Take-a-break-master/take-a-break.py
[WARNING] (MainThread) Finished step1 on process_file_contents
[WARNING] (MainThread) Finished step2 on process_file_contents
[INFO] (MainThread) Successfully ran process_file_contents test-env/3demax-Take-a-break.zip/test-env/3demax-Take-a-break.zip/Take-a-break-master/take-a-break.py
[INFO] (MainThread) Successfully ran process_zip_ball test-env/3demax-Take-a-break.zip
[INFO] (MainThread) Project finished <13,test-env/3demax-Take-a-break.zip> (process 0)
[INFO] (MainThread) (0): Total: 0:00:00.032250micros | Zip: 14999 Read: 254 Separators: 1666micros Tokens: 575micros Write: 251micros Hash: 29 regex: 632
[INFO] (MainThread) Process 0 finished. 6 files in 0s.
Process 0 finished, 6 files processed (3000006). Current total: 6
*** All done. 6 files in 0:00:00.069147
As I said, we don't support the block-level tokenizer anymore. If it breaks, you can keep the pieces :-)
Okay, thank you.
FWIW, I just updated the block-level tokenizer with a patch that should make it work with Python 3: https://github.com/Mondego/SourcererCC/commit/c296871e3315533563e3060476f45ed5cbfbd083
Appreciate your help!
I set up a Python 3.7 environment using conda and installed the dependencies from requirements.txt. I used the test-env and set config.ini to Python. But I could not get the block-level tokenizer to run. I kept getting this warning:
join() argument must be str or bytes, not 'ZipInfo'
The output looks like this:
I even used diff.txt. I know that's not the solution. Has anyone else run into the same problem?
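For anyone hitting the same message: in Python 3, `os.path.join` raises a `TypeError` with this kind of wording when one of its arguments is a `zipfile.ZipInfo` object rather than a string. I have not traced where this happens in the tokenizer, so the following is only a minimal sketch of how the error can arise and how it is usually avoided, namely by passing `info.filename`. The helper names (`build_archive`, `reproduce_error`, `member_paths`) and the archive contents are made up for illustration and are not part of SourcererCC.

```python
import io
import os
import zipfile


def build_archive():
    """Create a tiny in-memory zip so the example needs no files on disk."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("proj/test.py", "print('hi')\n")
    buf.seek(0)
    return buf


def reproduce_error(label, archive):
    """Pass a ZipInfo object straight to os.path.join and return the error text."""
    with zipfile.ZipFile(archive) as zf:
        info = zf.infolist()[0]  # a ZipInfo object, not a str
        try:
            os.path.join(label, info)
        except TypeError as exc:
            return str(exc)
    return None


def member_paths(label, archive):
    """Join on info.filename (a str) instead of the ZipInfo object itself."""
    with zipfile.ZipFile(archive) as zf:
        return [os.path.join(label, info.filename) for info in zf.infolist()]


if __name__ == "__main__":
    print(reproduce_error("example.zip", build_archive()))
    print(member_paths("example.zip", build_archive()))
```

If the tokenizer code you are running iterates over `infolist()` and joins the entries into a path, checking whether it uses the `ZipInfo` object or its `.filename` attribute may be a reasonable first thing to look at. The exact wording of the exception differs slightly between Python versions.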