rapidstream-org / rapidstream-tapa

RapidStream-TAPA compiles task-parallel HLS program into high-frequency FPGA accelerators.
https://tapa.rtfd.io
MIT License
149 stars 30 forks source link

Batch stream function call parse error #78

Closed vkomenda closed 3 weeks ago

vkomenda commented 2 years ago

I'm getting and error "no port _dout" as follows.

I0426 18:08:59.777 tapa.core:269] instrumenting upper-level RTL
Traceback (most recent call last):
  File "/home/vk/.local/bin/tapac", line 33, in <module>
    sys.exit(load_entry_point('tapa', 'console_scripts', 'tapac')())
  File "/home/vk/src/blaok/tapa/backend/python/tapa/tapac.py", line 527, in main
    program.generate_task_rtl(
  File "/home/vk/src/blaok/tapa/backend/python/tapa/core.py", line 272, in generate_task_rtl
    self._instrument_non_top_upper_task(task, part_num, additional_fifo_pipelining)
  File "/home/vk/src/blaok/tapa/backend/python/tapa/core.py", line 814, in _instrument_non_top_upper_task
    self._connect_fifos(task)
  File "/home/vk/src/blaok/tapa/backend/python/tapa/core.py", line 398, in _connect_fifos
    wire_width = self.get_task(task_name).module.get_port_of(
  File "/home/vk/src/blaok/tapa/backend/python/tapa/verilog/xilinx/module.py", line 199, in get_port_of
    raise ValueError(f'module {self.name} does not have port {fifo}.{suffix}')
ValueError: module compute_kernel does not have port in_hashes._dout

The function signature is

void compute_kernel(tapa::istream<ap_uint<256>>& in_hashes,
                    tapa::istream<uint64_t>& num_iters,
                    tapa::ostream<ap_uint<256>> out_q);

and the call is

void compute(tapa::istreams<ap_uint<256>, NK>& in_hashes_qs,
             tapa::istreams<uint64_t, NK>& num_iters_qs,
             tapa::ostreams<ap_uint<256>, NK>& out_qs) {
    tapa::task().invoke<tapa::detach, NK>(compute_kernel, in_hashes_qs, num_iters_qs, out_qs);
}

What is wrong with this function?

Licheng-Guo commented 2 years ago

Could you share the entire file? For now, could you try renaming your args and avoid "out" and "in" in them and see if the errors are gone?

vkomenda commented 2 years ago

@Licheng-Guo here is the file https://github.com/vkomenda/varana/blob/routing-fix/src/tapa_poh/tapa_poh.cpp

vkomenda commented 2 years ago

I renamed the arguments as you suggested but the error is still there.

Licheng-Guo commented 2 years ago

I took a quick look, there is a missing & here: https://github.com/vkomenda/varana/blob/46434daada3689a2e4c9799b276ee5a705688f52/src/tapa_poh/tapa_poh.cpp#L272

vkomenda commented 2 years ago

Thanks. With this correction I have the following error:

I0426 21:39:09.823 tapa.core:269] instrumenting upper-level RTL
Traceback (most recent call last):
  File "/home/vk/.local/bin/tapac", line 33, in <module>
    sys.exit(load_entry_point('tapa', 'console_scripts', 'tapac')())
  File "/home/vk/src/blaok/tapa/backend/python/tapa/tapac.py", line 560, in main
    program.generate_top_rtl(
  File "/home/vk/src/blaok/tapa/backend/python/tapa/core.py", line 349, in generate_top_rtl
    self.top_task.module.register_level = get_slr_count(part_num)
  File "/home/vk/src/blaok/tapa/backend/python/tapa/hardware.py", line 108, in get_slr_count
    raise NotImplementedError('unknown part_num %s', part_num)
NotImplementedError: ('unknown part_num %s', 'xcu55n-fsvh2892-2L-e')

tapac with the same arguments used to work just a few days ago.

Licheng-Guo commented 2 years ago

This is a recent bug, can you try this branch for now? https://github.com/UCLA-VAST/tapa/tree/fix_register_level Will merge into the master later today.

vkomenda commented 2 years ago

The fix_register_level branch seems to work for me. I'm testing the generated code.

vkomenda commented 2 years ago

I've been trying to simulate the example but my test fails with double-free errors and non-termination.

$ ./tapa-poh 1
double free or corruption (out)
Aborted
$ ./tapa-poh 2
double free or corruption (out)
Aborted
$ ./tapa-poh 3
munmap_chunk(): invalid pointer
Aborted
$ ./tapa-poh 4
munmap_chunk(): invalid pointer
Aborted
$ ./tapa-poh 5
munmap_chunk(): invalid pointer
Aborted
$ ./tapa-poh 6
double free or corruption (out)
Aborted
$ ./tapa-poh 7
double free or corruption (out)
Aborted
$ ./tapa-poh 48

The last command doesn't terminate.

Could you check if there are apparent errors in my program? I tried to follow on the steps of the network and vadd examples but something doesn't work. Possibly closing streams in batches since this wasn't done in the network example.

Licheng-Guo commented 2 years ago

I do not notice any obvious issue, could you try change async_mmap back to mmap for now to test if they are incorrect?

Blaok commented 2 years ago

@vkomenda Most likely you have an out-of-bound access here. You can add -fsanitize=address -g to your build-host.sh and see it yourself.

image

vkomenda commented 2 years ago

@Licheng-Guo as @Blaok correctly indicated, the out-of-bounds access concerns the output mmap rather than internal async_mmaps. I'm getting non-deterministic behaviour in that mmap, with the output results appearing on different places in different runs.

vkomenda commented 2 years ago

Just one last question regarding the stream semantics while I'm trying to understand the source of this non-determinism. Does eot become true for an istream in one task right after another task calls close on that stream despite the fact there may still be unread elements in that stream? I'm possibly having this early termination issue in line https://github.com/vkomenda/varana/blob/a4b0c3965fd34f487d1fdcc61a1c81b56f40e2ba/src/tapa-poh/tapa-poh.cpp#L321.

Blaok commented 2 years ago

eot is just a special type of element. You won't be able to see eot until all elements that were written before eot have been consumed.

Are you sure you want i + j * NK instead of i * NK + j?

Blaok commented 2 years ago

Also, you might want to use n_elem % NK instead of iq here and here.

vkomenda commented 2 years ago

Also, you might want to use n_elem % NK instead of iq here and here.

Thanks for spotting! Noted and corrected.

Are you sure you want i + j * NK instead of i * NK + j?

I am indeed. This is for the j-th output from the i-th kernel. This means non-sequential writes and you might argue that I should transform this to sequential writes instead.

vkomenda commented 2 years ago

This is a recent bug, can you try this branch for now? https://github.com/UCLA-VAST/tapa/tree/fix_register_level Will merge into the master later today.

The error is still in the main branch currently.

The part_num %s is not supported for floorplanning.
Licheng-Guo commented 2 years ago

Hi @vkomenda, this is because you enabled floorplanning, and your part_num is not supported for floorplanning yet. You should remove --enable-floorplanning and --floorplan-output to skip the floorplanning step.

Licheng-Guo commented 2 years ago

@vkomenda BTW, in case you have not seen it, we are holding a hybrid tutorial at FCCM this month. You are welcome to join https://www.fccm.org/workshop-tutorial-2022/

vkomenda commented 2 years ago

Hi @vkomenda, this is because you enabled floorplanning, and your part_num is not supported for floorplanning yet. You should remove --enable-floorplanning and --floorplan-output to skip the floorplanning step.

Hmm. This is the command that's failing. Did you enable floorplanning by default?

$ tapac -o tapa-poh.$PLATFORM.hw.xo tapa-poh.cpp --platform ~/Xilinx/platforms/$PLATFORM --top poh --work-dir tapa-poh.$PLATFORM.hw.xo.tapa

Traceback (most recent call last):
  File "/home/vk/.local/bin/tapac", line 33, in <module>
    sys.exit(load_entry_point('tapa', 'console_scripts', 'tapac')())
  File "/home/vk/src/blaok/tapa/backend/python/tapa/tapac.py", line 446, in main
    all_steps, last_step = parse_steps(args, parser)
  File "/home/vk/src/blaok/tapa/backend/python/tapa/tapac.py", line 412, in parse_steps
    parser.error('The part_num %s is not supported for floorplanning. '
TypeError: ArgumentParser.error() takes 2 positional arguments but 3 were given
vkomenda commented 2 years ago

@vkomenda BTW, in case you have not seen it, we are holding a hybrid tutorial at FCCM this month. You are welcome to join https://www.fccm.org/workshop-tutorial-2022/

Too late to renew my vaccination status for flying in but I'll attend over Zoom!

Blaok commented 2 years ago

Hmm. This is the command that's failing. Did you enable floorplanning by default?

Yes, it looks like we still have bugs in the current logic. I'll send a fix shortly and add a platform that does not support floorplanning to cover this in tests.

Licheng-Guo commented 2 years ago

In this case floorplanning is supposed to be skipped because floorplan-output is not provided, I'll check what happened

On Sat, May 7, 2022 at 9:46 AM Blaok @.***> wrote:

Hmm. This is the command that's failing. Did you enable floorplanning by default?

Yes, it looks like we still have bugs in the current logic. @Licheng-Guo https://github.com/Licheng-Guo How about we skip floorplanning and log a warning if the platform does not support floorplanning, unless floorplanning is explicitly enabled? I'll add a platform that does not support floorplanning to cover this.

— Reply to this email directly, view it on GitHub https://github.com/UCLA-VAST/tapa/issues/78#issuecomment-1120239288, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHXOD253IGHYMOU4XFXKKGLVI2M5DANCNFSM5UMQWTKA . You are receiving this because you were mentioned.Message ID: @.***>

Licheng-Guo commented 2 years ago

@vkomenda fixed in this branch: fix_floorplan_check had a bug in the parser checking logic

vkomenda commented 2 years ago

Thanks for the fix. Now I get errors in csynth_design as in #89.