patnashev / prst

PRST is a primality testing utility written in C++.

Question! #3

Open PrestackI opened 10 months ago

PrestackI commented 10 months ago

Can you include a "stoponprimedk" option like LLR's?

I wrote this janky Python code that works well enough for my purposes but is definitely not portable.

import os
import re

# change to the directory with your file in it

path = [active directory]

# determines files to process
for fil_list in os.listdir(path):
    if fil_list.startswith("r"):

        with open(fil_list,"r+") as prpfile:
            word = "is prime!"
            data = prpfile.readlines()[-1]
            if word in data:
                datas = data.split()
                tar_t = datas[0]
                tar_s = tar_t.split("^")
                tar_f = tar_s[0]
                print(tar_f)
                for fil_list in os.listdir(path):
                    if fil_list.endswith("bat"):
                        with open(fil_list,"r") as batfile:
                            for line in batfile:
                                data1 = batfile.read()
                                term = re.compile(rf'\b{re.escape(tar_f)}\b', re.IGNORECASE)
                                update = term.sub(lambda match: match.group().replace(tar_f, "pass"), data1)
                            with open(fil_list, "w") as batfile:
                                batfile.write(update)
                                prpfile.write("Skip \n")
            else:
                print("is not prime.")

happy5214 commented 4 months ago

I would not use that code as an example for a C++ implementation. It looks horrific, and I'm surprised it works for anyone at all.

  1. Use 4 spaces, not tabs. That's the Python way. (That's the least of the issues.)
  2. Use a regex to extract k instead of split.
  3. You're clobbering the batch file as you're reading from it, which is terrible practice.
  4. I don't see any assignment of update other than the last actually being written, as the lines are written after the for loop finishes.
  5. Don't shadow file handle variables like you did with batfile. Use different names or outdent the output with block to the same level as the input with block.
  6. for line in batfile (which iterates through each line in batfile) is undercut by the batfile.read() call (which reads in the rest of the file). Use one or the other.

I'm sure there may be other issues.

You're probably better off looking at LLR for the functionality of this parameter.
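
To make points 2 and 3 concrete, here is a minimal sketch rather than a drop-in fix. It assumes result lines begin with something like "3*2^353+1 is prime!"; the names K_B_N, extract_k_base and drop_lines_containing are made up for illustration and come from neither PRST nor LLR.

```python
import re

# Hypothetical pattern: "k*b^n" at the start of a result line.
K_B_N = re.compile(r"^(\d+)\*(\d+)\^(\d+)")


def extract_k_base(result_line):
    """Return "k*b" from a line such as "3*2^353+1 is prime!", or None."""
    match = K_B_N.match(result_line.strip())
    if match is None:
        return None
    k, base, _n = match.groups()
    return f"{k}*{base}"


def drop_lines_containing(file_path, needle):
    """Read the whole file first, close it, then rewrite it without matches."""
    with open(file_path, "r") as src:
        lines = src.readlines()
    kept = [line for line in lines if needle not in line]
    with open(file_path, "w") as dst:
        dst.writelines(kept)
    # Number of lines that were dropped.
    return len(lines) - len(kept)
```

Reading everything into memory before reopening the file for writing means the read handle is closed before the file is truncated, and each handle gets its own name.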

PrestackI commented 4 months ago

I deleted my previous comment because I broke the code after posting it.

The fixed version is below.

import os
import re

path = [active directory]

for fil_list in os.listdir(path):
    if fil_list.startswith("r"):
        with open(fil_list,"r+") as prpfile:
            word = "is a prime"
            data = prpfile.readlines()[-1]
            if word in data:
                datain = re.search(r"(\d+)\*(\d+)", data)
                dataout = datain.group()
                print(dataout)
                prpfile.write("Skip \n") # if word is found writes skip in result file
                for bat_list in os.listdir(path):
                    if bat_list.endswith("bat"): 
                        with open(bat_list,"r") as batfile:
                            batin = batfile.readlines()
                        with open(bat_list,"w") as batfile:
                            for line in batin:
                                if line.find(dataout) == -1:
                                    batfile.write(line)
            else:
                print("is not prime.")

Thanks for your comments!

Edit 1: It now removes the matching line instead of replacing the value with "pass".

happy5214 commented 4 months ago

I thought this task was dead, to be completely honest. Since you responded so quickly (and literally edited the last post as I was typing this), I'll add that you should probably stuff dataout in a list (a new global variable after the path declaration) and loop through the batch files in a separate for loop after you've gathered all the k/b pairs from the result files. (This also only uses the last line of the results file; I assume that's intentional.) Structuring the loops this way makes the runtime linear instead of quadratic, at the cost of a very slight increase in memory consumption.

Edit: It actually only improves the time complexity if you're comparing multiple k/b pairs (which you probably should be unless you're running this after every test), though it does save opening a bunch of file handles. It still does markedly improve the indentation. You should also use if dataout not in line: instead of if line.find(dataout) == -1:, since the index that find returns is never used.

An example of what that second for loop could be is:

for bat_list in os.listdir(path):
    if bat_list.endswith("bat"): 
        with open(bat_list,"r") as batfile:
            batin = batfile.readlines()
        with open(bat_list,"w") as batfile:
            for line in batin:
                if all(map(lambda k_base: k_base not in line, k_base_data)):
                    batfile.write(line)
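
Putting both passes together could look roughly like the sketch below. It is only an illustration under assumptions: the function names collect_k_base and prune_batch_files are invented here, the marker text "is a prime" is taken from the script above, and the step that writes "Skip" to the result file is left out for brevity.

```python
import os
import re

K_BASE = re.compile(r"(\d+)\*(\d+)")


def collect_k_base(directory, marker="is a prime"):
    """Pass 1: gather "k*b" strings from the last line of each result file."""
    found = []
    for name in sorted(os.listdir(directory)):
        if not name.startswith("r"):
            continue
        with open(os.path.join(directory, name), "r") as result_file:
            lines = result_file.readlines()
        # Skip empty files and files whose latest result is not a prime.
        if not lines or marker not in lines[-1]:
            continue
        match = K_BASE.search(lines[-1])
        if match:
            found.append(match.group())
    return found


def prune_batch_files(directory, k_base_data):
    """Pass 2: rewrite each .bat file once, dropping lines with any pair."""
    for name in os.listdir(directory):
        if not name.endswith("bat"):
            continue
        full_path = os.path.join(directory, name)
        with open(full_path, "r") as batfile:
            batin = batfile.readlines()
        with open(full_path, "w") as batfile:
            for line in batin:
                if not any(k_base in line for k_base in k_base_data):
                    batfile.write(line)
```

Note that a plain substring test means "3*2" also matches a line containing "13*2^400+1"; a word-boundary regex like the one in the first script would be stricter.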

PrestackI commented 4 months ago

Yes, reading only the last line of the result file was intentional as I am only testing to see if the latest value was prime/probable prime. If not prime, I want it to continue testing the range.

Yes, this function runs after every test of PRST - which was leaving a bunch of extra file handles being called in the updated bat file.

I fixed the extra file handle problem by taking your suggestion to use if dataout not in line:. I then added a global variable called skipper = "python SkipR_v5.py\n" (the Python file), added extra logic to the if statement (and skipper not in line), and finally wrote the file call again (batfile.write(skipper)).

Thank you so much for your comments! They have been very helpful and I learned quite a bit this weekend. These improvements will help me finish the smaller n ranges for CRUS faster now!

import os
import re

path = [active directory]
skipper = "python SkipR_v5.py\n"

for fil_list in os.listdir(path):
    if fil_list.startswith("r"):
        with open(fil_list,"r+") as prpfile:
            word = "is a probable prime"
            data = prpfile.readlines()[-1]
            if word in data:
                datain = re.search(r"(\d+)\*(\d+)", data)
                dataout = datain.group()
                print(dataout)
                prpfile.write("Skip \n")
                for bat_list in os.listdir(path):
                    if bat_list.endswith("bat"):
                        with open(bat_list,"r") as batfile:
                            batin = batfile.readlines()
                        with open(bat_list,"w") as batfile:
                            for line in batin:
                                if dataout not in line and skipper not in line:
                                    batfile.write(line)
                                    batfile.write(skipper)
            else:
                print("is not prime.")

Edit 1: Removing the lines messed with the position in the live bat file, so I am going to try adding new lines instead.
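
One possible alternative to removing lines, sketched here under assumptions: if the batch interpreter keeps its place in the running file by byte offset (not verified in this thread), then overwriting a matched line with a comment of exactly the same length would leave every later line where it was. The function name blank_lines_in_place is hypothetical, and the sketch assumes an ASCII batch file.

```python
def blank_lines_in_place(file_path, needle):
    """Overwrite lines containing needle with a same-length "rem" comment."""
    # newline="" keeps "\r\n" or "\n" endings exactly as they are on disk.
    with open(file_path, "r", newline="") as src:
        lines = src.readlines()
    changed = 0
    with open(file_path, "w", newline="") as dst:
        for line in lines:
            body = line.rstrip("\r\n")
            ending = line[len(body):]
            if needle in body and len(body) >= 3:
                # Pad "rem" with spaces so the line length is unchanged.
                dst.write("rem".ljust(len(body)) + ending)
                changed += 1
            else:
                dst.write(line)
    return changed
```

This swaps deletion for same-length replacement, so the file size and the offsets of all untouched lines stay the same.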